Abstract

It is by now well known that wireless networks with file arrivals and departures are stable if one uses α-fair
congestion control and back-pressure based scheduling and routing. In this paper, we examine whether α-fair
congestion control is necessary for flow-level stability. We show that stability can be ensured even with very simple
congestion control mechanisms, such as a fixed window size scheme which limits the maximum number of packets 
that are allowed into the ingress queue of a flow. A key ingredient of our result is the use of the difference between 
the logarithms of queue lengths as the link weights. This result is reminiscent of results in the context of CSMA 
algorithms, but for entirely different reasons. 

I. Introduction 

In order to operate wireless systems efficiently, scheduling algorithms are needed to facilitate simultaneous 
transmissions of different users. Scheduling algorithms for wireless networks have been widely studied since 
Tassiulas and Ephremides [1] proposed the max weight algorithm for single-hop wireless networks and its extension 
to multihop networks using the notion of back-pressure or differential backlog. The back-pressure algorithm (and 
hence, the max weight algorithm) is throughput optimal in the sense that it can stabilize the queues of the network 
for the largest set of arrival rates possible without knowing the actual arrival rates. The back-pressure algorithm 
works under very general conditions but it does not consider flow-level dynamics. It considers packet-level dynamics 
assuming that there is a fixed set of users/flows and packets are generated by each flow according to some stochastic 
process. In real networks, flows arrive randomly to the network, have only a finite amount of data, and depart the 
network after the data transfer is completed. Moreover, there is no notion of congestion control in the back-pressure 
algorithm while most modern communication networks use some congestion control mechanism for fairness purposes 
or to avoid excessive congestion inside the network Q. 

The research was supported in part by ARO MURIs W911NF-07-1-0287 and W911NF-08-1-0233 and AFOSR MURI FA9550-10-1-0573.
There is a rich body of literature on the packet-level stability of scheduling algorithms, e.g., [6], [7], [8]. Stability of wireless networks under flow-level dynamics has been studied in, e.g., [2], [3], [4]. Here, by stability, we mean that the number of flows in the network and the queue sizes at each node in the network remain finite. To achieve flow-level stability, these works use a specific form of congestion control based on α-fair policies; specifically, (a) the rates at which flows/files generate packets into their ingress queues maximize the sum-utility, where each user has a utility function of the form $U(x) = x^{1-\alpha}/(1-\alpha)$ for some $\alpha > 0$, where $x$ is the flow rate, and (b) the scheduling of packets in the network is performed based on the max weight/back-pressure algorithm.

When there are file/flow arrivals and departures, if the scheduler has access to the total queue length information 
at nodes, then it can use max weight/back-pressure algorithm to achieve throughput optimality, but this information 
is not typically available to the scheduler because it is implemented as part of the MAC layer. Moreover, without 
congestion control, queue sizes at different nodes could be widely different. This could lead to long periods of 
unfairness among flows. 

Therefore, we need to use congestion control to provide better QoS. With congestion control, only a few packets 
from each file are released to the MAC layer at each time instant, and scheduling is done based on these MAC layer 
packets. However, prior work requires a specific form of congestion control (namely, ingress queue-length based
rate adaptation based on α-fair utility functions). Here we show that, in fact, very general window
flow-control mechanisms are sufficient to ensure flow-level stability. The result suggests that ingress queue-based
congestion control is more important than α-fairness to ensure network stability, when congestion control is used
in conjunction with max weight scheduling/routing.

In establishing the above result, we have used the max weight algorithm with link weights that are log-differentials of MAC-layer queue lengths, i.e., the weight of a link is chosen to be of the form $\log(1 + q_i) - \log(1 + q_j)$, where $q_i$ and $q_j$ are MAC-layer queue lengths at nodes $i$ and $j$. Shorter versions of the results presented here appeared earlier in [20], [22], [23].

The use of logarithmic functions of queue lengths naturally suggests the use of a CSMA-type algorithm to implement the scheduling algorithm in a distributed fashion [12], [13], [10]. The main difference here is that the weights are log-differentials of queue lengths rather than logs of the queue lengths themselves. We show that the stability results for CSMA without time-scale separation can be extended to the model in this paper with log-differentials of queue lengths as weights, and the type of congestion control mechanisms considered here.

At this point, we comment on the differences between our paper and a related model considered in [5]. In [5], throughput-optimal scheduling algorithms have been derived for a connection-level model of a wireless network assuming that each link has access to the number of files waiting at the link. Here, we only use MAC-layer queue information, which is readily available. Further, [5] assumes a time-scale separation between CSMA and the file arrival-departure process. Such an assumption is not made in this paper.

The rest of the paper is organized as follows. In Section II, we describe our models for the wireless network, file arrivals, and Transport and MAC layers. We propose our scheduling algorithm in Section III. Section IV is devoted to the formal statement about the throughput-optimality of the algorithm and its proof. In Section V, we consider the distributed implementation of our algorithm, and Section VI contains conclusions. The appendices at the end of the paper contain some of the proofs.

II. System Model 

Model of wireless network 

Consider a multihop wireless network consisting of a set of nodes $\mathcal{N} = \{1, 2, \ldots, N\}$ and a set of links $\mathcal{L}$ between the nodes. There is a link from $i$ to $j$, i.e., $(i,j) \in \mathcal{L}$, if transmission from $i$ to $j$ is allowed. Let $\mu = [\mu_{ij} : (i,j) \in \mathcal{L}]$ be the rates according to which links can transmit packets. Let $\mathcal{R}$ denote the set of available rate vectors (or transmission schedules) $r = [r_{ij} : (i,j) \in \mathcal{L}]$. Note that each transmission schedule $r$ corresponds to a set of node power assignments chosen by the network. Also let $Co(\mathcal{R})$ denote the convex hull of $\mathcal{R}$, which corresponds to time-sharing between different rate vectors. Hence, in general, $\mu \in Co(\mathcal{R})$.

There is a set of users/source nodes $\mathcal{U} \subset \mathcal{N}$, and each user/source transfers data to a destination over a fixed route in the network.¹ For a user/source $u \in \mathcal{U}$, we use $d(u)$ $(\neq u)$ to denote its destination. Let $\mathcal{D} := d(\mathcal{U})$ denote the set of destinations.

We consider a time-slotted system. At each time slot $t$, new files can arrive at the source nodes and scheduling decisions must be made to deliver the files to destinations in multihop fashion along fixed routes. We use $a_s(t)$ to denote the number of files that arrive at source $s$ at time $t$ and assume that the process $\{a_s(t); s \in \mathcal{U}\}_{t=1,2,\ldots}$ is i.i.d. over time and independent across users with rate $[\kappa_s; s \in \mathcal{U}]$ and has bounded second moments. Moreover, we assume that there are $K$ possible file types, where the sizes of files of type $i$ are geometrically distributed with mean $1/\eta_i$ packets. A file arriving at source $s$ belongs to type $i$ with probability $p_{si}$, $i = 1, 2, \ldots, K$. Our motivation for selecting such a model is the large variance of file sizes in the Internet. It is believed, see e.g. [13], that most of the bytes are generated by long files while most of the flows are short flows. By controlling the probabilities $p_{si}$, for the same average file size, we can obtain distributions with very large variance. Let $m_s = \sum_{i=1}^{K} p_{si}/\eta_i$ denote the mean file size at node $s$, and define the workload at source $s$ by $\rho_s = \kappa_s m_s$. Let $\rho = [\rho_s : s \in \mathcal{U}]$ be the vector of loads.
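As a rough illustration of the traffic model above, the following Python sketch samples file sizes from a mixture of geometric distributions and evaluates the mean file size and the workload. The function names and the numerical parameters (two file types, the arrival rate) are assumptions made for this example only and do not come from the paper.

```python
import random

def sample_file_size(etas, probs, rng):
    """Draw one file size (in packets) from a mixture of geometric distributions."""
    eta = rng.choices(etas, weights=probs, k=1)[0]  # pick the file type i
    size = 1
    # Each packet is the last one with probability eta, so the mean is 1/eta.
    while rng.random() >= eta:
        size += 1
    return size

def mean_file_size(etas, probs):
    """Mean file size: sum over types of p_si / eta_i."""
    return sum(p / eta for p, eta in zip(probs, etas))

def workload(kappa, etas, probs):
    """Workload: file arrival rate times mean file size."""
    return kappa * mean_file_size(etas, probs)

# Hypothetical parameters: 90% short files (mean 2 packets), 10% long files (mean 100).
etas, probs, kappa = [0.5, 0.01], [0.9, 0.1], 0.05
rng = random.Random(0)
sizes = [sample_file_size(etas, probs, rng) for _ in range(20000)]
print(mean_file_size(etas, probs), sum(sizes) / len(sizes), workload(kappa, etas, probs))
```

With these made-up numbers most files are short, yet the rare long files carry most of the packets, which is the large-variance behavior the model is meant to capture.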

Model of Transport and MAC layers 

Upon arrival of a file at a source Transport layer, a TCP-connection is established that regulates the injection of 
packets into the MAC layer. Once transmission of a file ends, the file departs and the corresponding TCP-connection 
will be closed. The MAC-layer is responsible for making the scheduling decisions to deliver the MAC-layer packets 
to their destinations over their corresponding routes. Each node has a fixed routing table that determines the next 
hop for each destination. 

¹The final results can be extended to the case when each source has multiple destinations or to the cases of multi-path routing and adaptive routing. Here, to expose the main features, we have considered a simpler model.

At each source node, we index the files according to their arriving order such that the index 1 is given to the earliest file. This means that once transmission of a file ends, the indices of the remaining files are updated such that indices again start from 1 and are consecutive. Note that the indexing rule is not part of the algorithm implementation and is used here only for the purpose of analysis. We use $W_{sf}(t)$ to denote the TCP congestion window size for file $f$ at source $s$ at time $t$. Hence, $W_{sf}$ is a time-varying sequence which changes as a result of TCP congestion control. If the congestion window of file $f$ is not full, TCP will continue injecting packets from the remainder of file $f$ to the congestion window until file $f$ has no packets remaining at the Transport layer or the congestion window becomes full. We consider ingress queue-based congestion control, meaning that when a packet of the congestion window departs the ingress queue, it is replaced with a new packet from its corresponding file at the Transport layer. It is important to note that the MAC layer does not know the number of remaining packets at the Transport layer, so scheduling decisions have to be made based on MAC-layer information only. It is reasonable to assume that $1 \le W_{sf}(t) \le W_{cong}$, i.e., each file has at least one packet waiting to be transferred and all congestion window sizes are bounded from above by a constant $W_{cong}$.
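The window refill rule described in this subsection can be sketched as follows. This is only a minimal illustration under the stated assumptions (one file, integer packet counts); the function and argument names are hypothetical and not part of the paper.

```python
def refill_window(in_window, remaining, window_size):
    """Inject Transport-layer packets of one file into its congestion window.

    in_window:   packets of the file currently in the ingress (MAC-layer) queue
    remaining:   packets of the file still waiting at the Transport layer
    window_size: current window size, assumed to lie between 1 and W_cong
    Returns (new in_window, new remaining, number of packets injected).
    """
    room = max(window_size - in_window, 0)   # free space in the window
    injected = min(room, remaining)          # cannot inject more than the file has left
    return in_window + injected, remaining - injected, injected

print(refill_window(1, 5, 3))
```

The rule never looks at anything beyond the file's own window occupancy, which is why the MAC layer can stay unaware of the Transport-layer backlog.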



Routing and queue dynamics 

At the MAC layer of each node $n \in \mathcal{N}$, we consider separate queues for the packets of different destinations. Let $q_n^{(d)}$, $d \in \mathcal{D}$, denote the packets of destination $d$ at the MAC layer of $n$. Also let $R^{(d)}$ be the $N \times N$ routing matrix corresponding to packets of destination $d$, where $R_{ij}^{(d)} = 1$ if the next hop of node $i$ for destination $d$ is node $j$, for some $j$ such that $(i,j) \in \mathcal{L}$, and $0$ otherwise. Routes are acyclic, meaning that each packet eventually reaches its destination and leaves the network. A packet of destination $d$ that is transmitted from $i$ to $j$ is removed from $q_i^{(d)}$ and added to $q_j^{(d)}$. A packet that reaches its destination is removed from the network. Note that packets in $q_n^{(d)}$ could be generated at node $n$ itself (if $n$ is a source with destination $d$) or belong to other sources that use $n$ as an intermediate relay along their routes to their destinations.

For the analysis, we also use $Q_n^{(d)}$ (with capital $Q$) to denote the total per-destination queues, i.e., $Q_n^{(d)}$ represents the packets of destination $d$ at node $n$, in its MAC or Transport layer.

For each node $n$, the MAC (or total) per-destination queues $q_n^{(d)}$ (or $Q_n^{(d)}$) fall into three cases: (i) $n$ is a source and $d(n)$ is its destination, (ii) $n$ is a source but $d \neq d(n)$, and (iii) $n$ is not a source. In case (i), it is important to distinguish between the MAC-layer queue and the total queue associated with $d(n)$, i.e.,

$$q_n^{(d(n))} \le Q_n^{(d(n))},$$

because of the existing packets of destination $d(n)$ at the Transport layer of $n$. However, $Q_n^{(d)} = q_n^{(d)}$ holds for all $d \in \mathcal{D} \setminus d(n)$ in case (ii), and for all destinations in case (iii).

Let $z_{ij}^{(d)}(t)$ denote the number of packets of destination $d$ transmitted over link $(i,j) \in \mathcal{L}$ at time $t$. Then, the total-queue dynamics for a destination $d$, at each node $n$, is given by

$$Q_n^{(d)}(t+1) = Q_n^{(d)}(t) - \sum_{j=1}^{N} R_{nj}^{(d)} z_{nj}^{(d)}(t) + \sum_{i=1}^{N} R_{in}^{(d)} z_{in}^{(d)}(t) + A_n^{(d)}(t),$$

where $A_n^{(d)}(t)$ is the total number of packets for destination $d$ that new files bring to node $n$ at time slot $t$. To express one formula for the queue dynamics in all three cases, (i), (ii), and (iii), we can write $E[A_n^{(d)}(t)] = \rho_n^{(d)}$, where $\rho_n^{(d)} := \rho_n$ in case (i) and $\rho_n^{(d)} := 0$ otherwise.

Let $x_{ij}^{(d)}(t)$ denote the scheduling variable that shows the rate at which the packets of destination $d$ can be forwarded over the link $(i,j)$. Note that $z_{ij}^{(d)}(t) = \min\{x_{ij}^{(d)}(t), q_i^{(d)}(t)\}$, because $i$ cannot send more than its queue content at each time.

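The capacity region of the network $\mathcal{C}$ is defined as the set of all load vectors $\rho$ under which the total queues in the network can be stabilized. Note that under our connection-level model, stability of total queues implies that the number of files in the network is also stable. It is well known (see, e.g., [1]) that a vector $\rho$ belongs to $\mathcal{C}$ if and only if there exists a transmission rate vector $\mu \in Co(\mathcal{R})$ such that

$$\mu_{ij}^{(d)} \ge 0; \quad \forall d \in \mathcal{D} \text{ and } \forall (i,j) \in \mathcal{L},$$

$$\rho_n^{(d)} - \sum_{j=1}^{N} R_{nj}^{(d)} \mu_{nj}^{(d)} + \sum_{i=1}^{N} R_{in}^{(d)} \mu_{in}^{(d)} \le 0; \quad \forall d \in \mathcal{D} \text{ and } \forall n \neq d,$$

$$\sum_{d \in \mathcal{D}} \mu_{ij}^{(d)} \le \mu_{ij}; \quad \forall (i,j) \in \mathcal{L}.$$
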

III. Description of Scheduling Algorithm 

The algorithm is essentially the back-pressure algorithm [1], but it only uses the MAC-layer information. The key step in establishing the optimality of such an algorithm is using an appropriate weight function of the MAC-layer queues instead of using the total queues. In particular, consider a log-type function

$$g(x) := \frac{\log(1+x)}{h(x)}, \qquad (1)$$

where $h(x)$ is an arbitrary increasing function which makes $g(x)$ an increasing concave function. Assume that $h(0) > 0$ and $g(x)$ is continuously differentiable on $(0, \infty)$. For example, $h(x) = \log(e + \log(1+x))$ or $h(x) = \log^{\theta}(e + x)$ for some $0 < \theta < 1$. For each link $(i,j)$ with $R_{ij}^{(d)} = 1$, define

$$w_{ij}^{(d)}(t) := g(q_i^{(d)}(t)) - g(q_j^{(d)}(t)). \qquad (2)$$

Note that if $\{d \in \mathcal{D} : R_{ij}^{(d)} = 1\} = \emptyset$, then we can remove the link $(i,j)$ from the network without reducing the capacity region since no packets are forwarded over it. So without loss of generality, we assume that $\{d \in \mathcal{D} : R_{ij}^{(d)} = 1\} \neq \emptyset$, for every $(i,j) \in \mathcal{L}$. Then the scheduling algorithm is as follows:
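At each time $t$:

• Each node $n$ observes the MAC-layer queue sizes of itself and its next hop, i.e., for each $d \in \mathcal{D}$, it observes $q_n^{(d)}(t)$ and $q_j^{(d)}(t)$ for the $j$ such that $R_{nj}^{(d)} = 1$.

• For each link $(i,j)$, define a weight

$$w_{ij}(t) := \max_{d \in \mathcal{D}: R_{ij}^{(d)} = 1} w_{ij}^{(d)}(t), \qquad (3)$$

and

$$d_{ij}^{*}(t) = \arg\max_{d \in \mathcal{D}: R_{ij}^{(d)} = 1} w_{ij}^{(d)}(t). \qquad (4)$$

• The network needs to find the optimal rate vector $x^* \in \mathcal{R}$ that solves

$$x^{*}(t) = \arg\max_{r \in \mathcal{R}} \sum_{(i,j) \in \mathcal{L}} r_{ij} w_{ij}(t). \qquad (5)$$

• Finally, assign $x_{ij}^{(d)}(t) = x_{ij}^{*}(t)$ if $d = d_{ij}^{*}(t)$, and zero otherwise (break ties at random).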
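The steps above can be sketched in Python for a toy network. This is an illustrative sketch only: the data layout (`q`, `next_hop`), the brute-force search over an explicitly listed set of feasible schedules, and the deterministic tie-breaking are assumptions of this example, whereas the algorithm in the text breaks ties at random and leaves the search method open.

```python
import math

def g(x):
    """Log-type weight (1) with the example h(x) = log(e + log(1 + x))."""
    return math.log(1 + x) / math.log(math.e + math.log(1 + x))

def link_weight(q, next_hop, i, j):
    """Return (w_ij, d*_ij) per (2)-(4), or None if no destination routes over (i, j).

    q[n][d]        : MAC-layer queue of destination d at node n
    next_hop[d][n] : next hop of node n towards destination d
    """
    best = None
    for d, hops in next_hop.items():
        if hops.get(i) == j:                      # R_ij^(d) = 1
            w = g(q[i].get(d, 0)) - g(q[j].get(d, 0))
            if best is None or w > best[0]:       # ties: first destination wins here
                best = (w, d)
    return best

def max_weight_schedule(weights, schedules):
    """Brute-force (5): pick the feasible schedule with the largest total weight."""
    return max(schedules, key=lambda s: sum(weights[l] for l in s))

# Hypothetical line network 0 -> 1 -> 2 with a single destination 2.
q = {0: {2: 10}, 1: {2: 3}, 2: {}}
next_hop = {2: {0: 1, 1: 2}}
links = [(0, 1), (1, 2)]
weights = {l: link_weight(q, next_hop, *l)[0] for l in links}
schedules = [frozenset(), frozenset({(0, 1)}), frozenset({(1, 2)})]  # the two links conflict
best = max_weight_schedule(weights, schedules)
print(weights, best)
```

In this toy instance the link into the destination wins, because the destination holds no backlog and the log-differential on that link is therefore the larger one.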



IV. System Stability 



In this section, we analyze the system and prove its stability under the algorithm described in Section III. The following theorem states our main result.

Theorem 1. For any $\rho$ strictly inside $\mathcal{C}$, the scheduling algorithm in Section III can stabilize the network independent of the transport-layer ingress queue-based congestion control mechanism (as long as the minimum window size is one and the window sizes are bounded) and the (nonidling) service discipline used to transmit packets from active nodes.

Remark 1. Theorem 1 holds even when $h \equiv 1$ in (1); however, for the distributed implementation of the algorithm in Section V, we need $g$ to grow slightly slower than $\log$.

Theorem 1 shows that it is possible to design the ingress queue-based congestion controller regardless of the scheduling algorithm implemented in the core network. This allows using different congestion control mechanisms at the edge of the network for different fairness or QoS considerations without the need to change the scheduling algorithm implemented at the internal routers of the network. As we will see, a key ingredient of such a decomposition is to use the difference between the logarithms of queue lengths, as in (2), for the link weights in the scheduling algorithm. The rest of this section is devoted to the proof of Theorem 1.

Order of events 

Since we use a discrete-time model, we have to specify the order in which files/packets arrive and depart, which we do below:

1) At the beginning of each time slot, a scheduling decision is made by the scheduling algorithm. Packets depart from the MAC layers of scheduled links.

2) File arrivals occur next. Once a file arrives, a new TCP connection is set up for that file with an initial pre-determined congestion window size.

3) For each TCP connection, if the congestion window is not full, packets are injected into the MAC layer from the Transport layer until the window size is fully used or there are no more packets at the Transport layer.

We re-index the files at the beginning of each time slot because some files might have departed during the last time slot.
State of the system

Define the state of node $n$ as

$$\mathcal{S}_n(t) = \{(q_n^{(d)}(t), \mathcal{I}_n^{(d)}(t)) : d \in \mathcal{D}, \ (\xi_{nf}(t), W_{nf}(t), \sigma_{nf}(t)) : 1 \le f \le N_n(t)\},$$

where $N_n(t)$ is the number of existing files at node $n$ at the beginning of time slot $t$, $\sigma_{nf}(t) \in \{1/\eta_1, \ldots, 1/\eta_K\}$ is the mean size (or type) of file $f$, and $W_{nf}(t)$ is its corresponding congestion window size. Note that $\sigma_{nf}(t)$ is a function of time only because of re-indexing, since a file might change its index from slot to slot. $\xi_{nf}(t)$ is an indicator function of whether file $f$ still has packets in the Transport layer, i.e., if $U_{nf}(t)$ is the number of remaining packets of file $f$ at node $n$, then

$$\xi_{nf}(t) = 1\{U_{nf}(t) > W_{nf}(t)\},$$

thus $\xi_{nf}(t) = 1$ if the last packet of file $f$ has not been injected to the MAC layer of node $n$, and $\xi_{nf}(t) = 0$ if there are no remaining packets of file $f$ at the Transport layer of node $n$. Obviously, if $n$ is not a source node, then we can remove $(\xi_{nf}, W_{nf}, \sigma_{nf})$ from the description of $\mathcal{S}_n$. $\mathcal{I}_n^{(d)}(t)$ denotes the information required about $q_n^{(d)}(t)$ to serve the MAC-layer packets, which depends on the specific service discipline implemented in the MAC-layer queues. In the rest of the paper, we consider the case of the FIFO (First In-First Out) service discipline in MAC-layer queues. In this case, $\mathcal{I}_n^{(d)}(t)$ is simply the ordering of packets in $q_n^{(d)}(t)$ according to their entrance times. As will turn out from the proof, the system stability holds for any non-idling service discipline. Define the state of the system to be $\mathcal{S}(t) = \{\mathcal{S}_n(t) : n \in \mathcal{N}\}$. Now, given the scheduling algorithm in Section III and our traffic model, $\mathcal{S}(t)$ evolves as a discrete-time Markov chain.

Remark 2. We only require that the congestion window dynamics could be described as a function of queue lengths 
of the network so that the network Markov chain is well-defined. Even in the case that the congestion window is 
a function of the delayed queue lengths of the network up to T time slots before, due to the feedback delay of at 
most T from destination to source, the network state could be modified, to include the queues up to T time slots 
before, so that the same proof technique still applies. 

Next, we analyze the Lyapunov drift to show that the network Markov chain is positive recurrent and, as a result, 
the number of files in the system and queue sizes are stable. 

Lyapunov analysis 

Define $\bar{Q}_n^{(d)}(t) := E[Q_n^{(d)}(t) \mid \mathcal{S}_n(t)]$ to be the expected total queue length at node $n$ given the state $\mathcal{S}_n(t)$. Then, if $n$ is a source and $d$ is its destination,

$$\bar{Q}_n^{(d)}(t) = q_n^{(d)}(t) + \sum_{f=1}^{N_n(t)} \sigma_{nf}(t)\,\xi_{nf}(t). \qquad (6)$$

Otherwise, if $d \neq d(n)$ or $n$ is not a source, then $\bar{Q}_n^{(d)}(t) = q_n^{(d)}(t)$. Note that given the state $\mathcal{S}(t)$, $\bar{Q}_n^{(d)}(t)$ is known. The dynamics of $\bar{Q}_n^{(d)}(t)$ involves the dynamics of $q_n^{(d)}(t)$, $\xi_n(t)$, and $N_n(t)$, and, thus, it consists of:
(i) departure of MAC-layer packets,

(ii) new file arrivals (if $n$ is a source),

(iii) arrival of packets from previous hops that use $n$ as an intermediate relay to forward packets to their destinations,

(iv) injection of packets into the MAC layer (if $n$ is a source), and

(v) departure of files from the Transport layer (if $n$ is a source).

Hence,

$$\bar{Q}_n^{(d)}(t+1) = \bar{Q}_n^{(d)}(t) - \sum_{j=1}^{N} R_{nj}^{(d)} z_{nj}^{(d)}(t) + \sum_{i=1}^{N} R_{in}^{(d)} z_{in}^{(d)}(t) + \bar{A}_n^{(d)}(t) + \hat{A}_n^{(d)}(t) - D_n^{(d)}(t), \qquad (7)$$

where $\bar{A}_n^{(d)}(t) = \sum_{f=N_n(t)+1}^{N_n(t)+a_n(t)} \sigma_{nf}(t)$ is the expected number of packet arrivals due to new files, $\hat{A}_n^{(d)}(t)$ is the total number of packets injected into the MAC layer to fill up the congestion window after scheduling and new file arrivals, and $D_n^{(d)}(t) = \sum_{f=1}^{N_n(t)+a_n(t)} \sigma_{nf}(t) I_{nf}(t)$ is the Transport-layer "expected packet departure" because of the MAC-layer injections. Here, $I_{nf}(t) = 1$ indicates that the last packet of file $f$ leaves the Transport layer during time slot $t$; otherwise, $I_{nf}(t) = 0$. To see the difference between the indicators $I_{nf}(t)$ and $\xi_{nf}(t)$, consider a specific file and assume that its last packet enters the Transport layer at time slot $t_0$, departs the Transport layer during time slot $t_1$, and departs the MAC layer during time slot $t_2$. Then its corresponding indicator $I$ is 1 at time $t_1$ and is 0 for $t_0 \le t < t_1$ and $t_1 < t \le t_2$, while its indicator $\xi$ is 0 for all times $t_1 < t \le t_2$, and 1 for $t_0 \le t \le t_1$.
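To make the bookkeeping behind the expected total queue in (6) concrete, here is a small Python sketch. The function name and the example values are hypothetical; the only fact used is that a file of a type with mean size $1/\eta$ contributes that mean to the expectation while it still has packets at the Transport layer.

```python
def expected_total_queue(q_mac, files):
    """Expected total queue per (6): MAC-layer queue plus the mean size of
    every file that still has packets at the Transport layer.

    files: list of (sigma, xi) pairs; sigma is the mean size 1/eta of the
           file's type, xi is 1 if the file still has Transport-layer packets.
    """
    return q_mac + sum(sigma * xi for sigma, xi in files)

# Hypothetical source node: 4 MAC-layer packets and three files of types with
# mean sizes 2, 100 and 100 packets; the second file is fully injected (xi = 0).
print(expected_total_queue(4, [(2.0, 1), (100.0, 0), (100.0, 1)]))
```

Because geometric file sizes are memoryless, the contribution of a file does not shrink as its packets are injected; it drops to zero only when the last packet leaves the Transport layer.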



Note that $E[\bar{A}_n^{(d)}(t)] = \rho_n^{(d)}$ is the mean packet arrival rate at node $n$ for destination $d$. Let $B_n^{(d)}(t) := \hat{A}_n^{(d)}(t) - D_n^{(d)}(t)$, and define $E_{\mathcal{S}(t)}[\cdot] := E[\cdot \mid \mathcal{S}(t)]$. It should be clear that when $n$ is a source but $d \neq d(n)$, or when $n$ is not a source, $\bar{A}_n^{(d)}(t) = \hat{A}_n^{(d)}(t) = D_n^{(d)}(t) = B_n^{(d)}(t) = 0$. Let $r_{max}$ denote the maximum link capacity over all the links in the network. Then Lemma 1 characterizes the first and second moments of $B_n^{(d)}(t)$.


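Lemma 1. For the process $\{B_n^{(d)}(t)\}$:

(i) $E_{\mathcal{S}(t)}[B_n^{(d)}(t)] = 0$.

(ii) Let $\eta_{min} = \min_{1 \le i \le K} \eta_i$; then $E_{\mathcal{S}(t)}[B_n^{(d)}(t)^2] \le (\kappa_n + N^2 r_{max}^2) \max\{W_{cong}^2, 1/\eta_{min}^2\}$.

Therefore, we can write

$$\bar{Q}_n^{(d)}(t+1) = \bar{Q}_n^{(d)}(t) - \sum_{j=1}^{N} R_{nj}^{(d)} z_{nj}^{(d)}(t) + \sum_{i=1}^{N} R_{in}^{(d)} z_{in}^{(d)}(t) + \tilde{A}_n^{(d)}(t),$$

where $\tilde{A}_n^{(d)}(t) := \bar{A}_n^{(d)}(t) + B_n^{(d)}(t)$. Note that $\tilde{A}_n^{(d)}(t)$ has mean $\rho_n^{(d)}$ and finite second moment.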

Let $G(u) := \int_0^u g(x)\,dx$ for the function $g$ defined in (1). Then $G$ is a strictly convex function. Consider the Lyapunov function

$$V(\mathcal{S}(t)) = \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} G(\bar{Q}_n^{(d)}(t)).$$



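Let $\Delta V(t) := V(\mathcal{S}(t+1)) - V(\mathcal{S}(t))$; then, using convexity of $G$, we get

$$\Delta V(t) \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t+1)) \left(\bar{Q}_n^{(d)}(t+1) - \bar{Q}_n^{(d)}(t)\right).$$

Using the concavity of $g$ and the fact that $g' \le 1$, we have

$$|g(\bar{Q}_n^{(d)}(t+1)) - g(\bar{Q}_n^{(d)}(t))| \le |\bar{Q}_n^{(d)}(t+1) - \bar{Q}_n^{(d)}(t)|.$$

Furthermore, observe that, based on (7),

$$|\bar{Q}_n^{(d)}(t+1) - \bar{Q}_n^{(d)}(t)| \le \tilde{A}_n^{(d)}(t) + N r_{max}.$$

Hence,

$$\Delta V(t) \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t)) \left(\bar{Q}_n^{(d)}(t+1) - \bar{Q}_n^{(d)}(t)\right) + \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} \left(\tilde{A}_n^{(d)}(t) + N r_{max}\right)^2.$$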

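Define $u_n^{(d)}(t) := \max\left\{\sum_{j=1}^{N} R_{nj}^{(d)} x_{nj}^{(d)}(t) - q_n^{(d)}(t),\ 0\right\}$ to be the wasted service for packets of destination $d$, i.e., when $n$ is included in the schedule but it does not have enough packets of destination $d$ to transmit. Then, we have

$$\Delta V(t) \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} \left\{ g(\bar{Q}_n^{(d)}(t)) \left[ \sum_{i=1}^{N} R_{in}^{(d)} x_{in}^{(d)}(t) + \tilde{A}_n^{(d)}(t) - \sum_{j=1}^{N} R_{nj}^{(d)} x_{nj}^{(d)}(t) \right] \right\} + \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t))\, u_n^{(d)}(t) + \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} \left(\tilde{A}_n^{(d)}(t) + N r_{max}\right)^2.$$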

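Taking the expectation of both sides, given that the state at time $t$ is known, yields

$$E_{\mathcal{S}(t)}[\Delta V(t)] \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t))\, E_{\mathcal{S}(t)}\left[ \rho_n^{(d)} + \sum_{i=1}^{N} R_{in}^{(d)} x_{in}^{(d)}(t) - \sum_{j=1}^{N} R_{nj}^{(d)} x_{nj}^{(d)}(t) \right] + E_{\mathcal{S}(t)}\left[ \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t))\, u_n^{(d)}(t) \right] + C_1,$$

where $C_1 = E\left[\sum_{n=1}^{N} \sum_{d \in \mathcal{D}} \left(\tilde{A}_n^{(d)}(t) + N r_{max}\right)^2\right] < \infty$, because $E[\tilde{A}_n^{(d)}(t)^2] < \infty$.
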



Lemma 2. There exists a positive constant $C_2$ such that, for all $\mathcal{S}(t)$,

$$\sum_{n=1}^{N} \sum_{d \in \mathcal{D}} E_{\mathcal{S}(t)}\left[ g(\bar{Q}_n^{(d)}(t))\, u_n^{(d)}(t) \right] \le C_2.$$

Using Lemma 2 and changing the order of summations, we have

$$E_{\mathcal{S}(t)}[\Delta V(t)] \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t))\, \rho_n^{(d)} - E_{\mathcal{S}(t)}\left[ \sum_{(i,j) \in \mathcal{L}} \sum_{d \in \mathcal{D}} x_{ij}^{(d)}(t) \left( g(\bar{Q}_i^{(d)}(t)) - g(\bar{Q}_j^{(d)}(t)) \right) \right] + C_1 + C_2.$$

Recall that the link weight that is actually used in the algorithm is based on the MAC-layer queues as in (2)-(3). For the analysis, we also define a new link weight based on the state as

$$W_{ij}(t) = \max_{d \in \mathcal{D}: R_{ij}^{(d)} = 1} W_{ij}^{(d)}(t), \qquad (8)$$

where, for a link $(i,j) \in \mathcal{L}$ with $R_{ij}^{(d)} = 1$,

$$W_{ij}^{(d)}(t) := g(\bar{Q}_i^{(d)}(t)) - g(\bar{Q}_j^{(d)}(t)). \qquad (9)$$

Then, the two types of link weights only differ by a constant, as stated by the following lemma.

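Lemma 3. Let $W_{ij}(t)$ and $w_{ij}(t)$, $(i,j) \in \mathcal{L}$, be the link weights defined by (8)-(9) and (2)-(3), respectively. Then, at all times,

$$|W_{ij}(t) - w_{ij}(t)| \le \frac{\log(1 + 1/\eta_{min})}{h(0)}.$$

Proof: Recall that, at each node $n$, for all destinations $d \neq d(n)$, we have $\bar{Q}_n^{(d)} = q_n^{(d)}$. If $d = d(n)$ is the destination of $n$, then $q_n^{(d)}$ consists of: (i) packets of $d$ received from upstream flows that use $n$ as an intermediate relay, and (ii) MAC-layer packets received from the files generated at $n$ itself. Since $1 \le W_{nf}(t) \le W_{cong}$, the number of files with destination $d$ that are generated at node $n$ or have packets at node $n$ as an intermediate relay is at most $q_n^{(d)}(t)$. Therefore, it is clear that

$$q_n^{(d)}(t) \le \bar{Q}_n^{(d)}(t) \le q_n^{(d)}(t)\,(1 + 1/\eta_{min}).$$

Hence, for all $n$ and $d$, using a log-type function as the function $g$ in (1) yields

$$g(q_n^{(d)}) \le g(\bar{Q}_n^{(d)}) \le g\left(q_n^{(d)}(1 + 1/\eta_{min})\right) \le \frac{\log\left((1 + q_n^{(d)})(1 + 1/\eta_{min})\right)}{h(q_n^{(d)})} \le g(q_n^{(d)}) + \frac{\log(1 + 1/\eta_{min})}{h(0)}. \qquad (10)$$

It then follows that, $\forall d \in \mathcal{D}$ and $\forall (i,j) \in \mathcal{L}$ with $R_{ij}^{(d)} = 1$,

$$|W_{ij}^{(d)} - w_{ij}^{(d)}| \le \log(1 + 1/\eta_{min})/h(0). \qquad (11)$$

Let $\tilde{d}_{ij}^{*} := \arg\max_{d} W_{ij}^{(d)}$ and $d_{ij}^{*}$ be as in (4). Then, using (11), we have that

$$W_{ij} \ge W_{ij}^{(d_{ij}^{*})} \ge w_{ij} - \log(1 + 1/\eta_{min})/h(0),$$

and

$$w_{ij} \ge w_{ij}^{(\tilde{d}_{ij}^{*})} \ge W_{ij} - \log(1 + 1/\eta_{min})/h(0).$$

This concludes the proof. ■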
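The key step of the proof, namely that inflating a queue by the factor $1 + 1/\eta_{min}$ changes $g$ by at most $\log(1 + 1/\eta_{min})/h(0)$, can be checked numerically. The Python sketch below does this for the example $h(x) = \log(e + \log(1+x))$ from Section III; the value of $\eta_{min}$ and the range of queue lengths are arbitrary choices made for illustration, and a numerical check of this kind is of course not a substitute for the proof.

```python
import math

def h(x):
    return math.log(math.e + math.log(1 + x))   # example h from Section III

def g(x):
    return math.log(1 + x) / h(x)

def gap_bound(eta_min):
    """The constant log(1 + 1/eta_min) / h(0) appearing in the lemma."""
    return math.log(1 + 1 / eta_min) / h(0)

def worst_gap(eta_min, queue_lengths):
    """Largest g(q (1 + 1/eta_min)) - g(q) over the given MAC queue lengths."""
    return max(g(q * (1 + 1 / eta_min)) - g(q) for q in queue_lengths)

eta_min = 0.1                                     # hypothetical value
print(worst_gap(eta_min, range(0, 100001, 7)), gap_bound(eta_min))
```

For this choice of $h$ we have $h(0) = 1$, so the constant is simply $\log(1 + 1/\eta_{min})$, and the observed gap stays below it because $h$ grows with the queue length.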
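Let $\tilde{x}^{*}(t)$ be the max weight schedule based on the weights $\{W_{ij}(t) : (i,j) \in \mathcal{L}\}$, i.e.,

$$\tilde{x}^{*}(t) = \arg\max_{x \in \mathcal{R}} \sum_{(i,j) \in \mathcal{L}} x_{ij} W_{ij}(t). \qquad (12)$$

Note the distinction between $\tilde{x}^{*}$ and $x^{*}$, as we used $x^{*}(t)$ in (5) to denote the max weight schedule based on MAC-layer queues. The weights of the schedules $\tilde{x}^{*}$ and $x^{*}$ differ only by a constant for all queue values, as we show next. First note that, from the definition of $\tilde{x}^{*}$,

$$\sum_{(i,j) \in \mathcal{L}} \tilde{x}_{ij}^{*} W_{ij}(t) - \sum_{(i,j) \in \mathcal{L}} x_{ij}^{*} W_{ij}(t) \ge 0. \qquad (13)$$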
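Next, we have

$$\sum_{(i,j) \in \mathcal{L}} \tilde{x}_{ij}^{*} W_{ij}(t) - \sum_{(i,j) \in \mathcal{L}} x_{ij}^{*} W_{ij}(t) = \sum_{(i,j) \in \mathcal{L}} \tilde{x}_{ij}^{*} W_{ij}(t) - \sum_{(i,j) \in \mathcal{L}} \tilde{x}_{ij}^{*} w_{ij}(t) \qquad (14)$$

$$+ \sum_{(i,j) \in \mathcal{L}} \tilde{x}_{ij}^{*} w_{ij}(t) - \sum_{(i,j) \in \mathcal{L}} x_{ij}^{*} w_{ij}(t) \qquad (15)$$

$$+ \sum_{(i,j) \in \mathcal{L}} x_{ij}^{*} w_{ij}(t) - \sum_{(i,j) \in \mathcal{L}} x_{ij}^{*} W_{ij}(t) \qquad (16)$$

$$\le 2 N^2 r_{max} \log(1 + 1/\eta_{min})/h(0),$$

because, by Lemma 3, (14) and (16) are less than $N^2 r_{max} \log(1 + 1/\eta_{min})/h(0)$ each, and (15) is nonpositive by the definition of $x^{*}$. Hence, under MAC scheduling $x^{*}$, the Lyapunov drift is bounded as follows:

$$E_{\mathcal{S}(t)}[\Delta V(t)] \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t))\, \rho_n^{(d)} - E_{\mathcal{S}(t)}\left[ \sum_{(i,j) \in \mathcal{L}} \tilde{x}_{ij}^{*}(t) W_{ij}(t) \right] + C,$$

where $C = C_1 + C_2 + 2 N^2 r_{max} \log(1 + 1/\eta_{min})/h(0)$.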

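Accordingly, using (8)-(9) and changing the order of summations in the right-hand side of the above inequality yields

$$E_{\mathcal{S}(t)}[\Delta V(t)] \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} \left\{ g(\bar{Q}_n^{(d)}(t))\, E_{\mathcal{S}(t)}\left[ \rho_n^{(d)} + \sum_{i=1}^{N} R_{in}^{(d)} \tilde{x}_{in}^{*(d)}(t) - \sum_{j=1}^{N} R_{nj}^{(d)} \tilde{x}_{nj}^{*(d)}(t) \right] \right\} + C,$$

where $\tilde{x}_{ij}^{*(d)}(t) = \tilde{x}_{ij}^{*}(t)$ for $d = \tilde{d}_{ij}^{*}$ (break ties at random) and is zero otherwise. The rest of the proof is standard. Since the load $\rho$ is strictly inside the capacity region, there must exist an $\epsilon > 0$ and a $\mu \in Co(\mathcal{R})$ such that

$$\rho_n^{(d)} + \epsilon \le \sum_{j=1}^{N} R_{nj}^{(d)} \mu_{nj}^{(d)} - \sum_{i=1}^{N} R_{in}^{(d)} \mu_{in}^{(d)}; \quad \forall d \in \mathcal{D} \text{ and } \forall n \neq d. \qquad (17)$$

Hence, for any $\delta > 0$,

$$E_{\mathcal{S}(t)}[\Delta V(t)] \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t)) \left[ \sum_{j=1}^{N} R_{nj}^{(d)} \mu_{nj}^{(d)} - \sum_{i=1}^{N} R_{in}^{(d)} \mu_{in}^{(d)} \right] - \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t)) \left[ \sum_{j=1}^{N} R_{nj}^{(d)} \tilde{x}_{nj}^{*(d)}(t) - \sum_{i=1}^{N} R_{in}^{(d)} \tilde{x}_{in}^{*(d)}(t) \right] - \epsilon \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t)) + C.$$

But $\sum_{(i,j) \in \mathcal{L}} \tilde{x}_{ij}^{*} W_{ij}(t) \ge \sum_{(i,j) \in \mathcal{L}} \mu_{ij} W_{ij}(t)$, $\forall \mu \in Co(\mathcal{R})$; hence,

$$E_{\mathcal{S}(t)}[\Delta V(t)] \le -\epsilon \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(\bar{Q}_n^{(d)}(t)) + C \le -\delta,$$

whenever $\max_{n,d} \bar{Q}_n^{(d)} \ge g^{-1}\left(\frac{C + \delta}{\epsilon}\right)$ or, as a sufficient condition, whenever $\max_{n,d} q_n^{(d)} \ge g^{-1}\left(\frac{C + \delta}{\epsilon}\right)$. Therefore, it follows that the system is stable by an extension of the Foster-Lyapunov criteria [16] (Theorem 3.1 in [1]). In particular, queue sizes and the number of files in the system are stable.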
Remark 3. Although we have assumed that file sizes follow a mixture of geometric distributions, our results also hold for the case of bounded file sizes with general distribution. The proof argument for the latter case is obtained by minor modifications of the proof presented in this paper (see [22]) and, hence, has been omitted for brevity.

V. Distributed Implementation 

The optimal scheduling algorithm in Section III requires us to find a maximum weight-type schedule at each time, i.e., to solve (5) at each time $t$. This is a formidable task; hence, in this section, we design a distributed version of the algorithm based on Glauber dynamics.

For simplicity, we consider the following criterion for successful packet reception: packet transmission over link $(i,j) \in \mathcal{L}$ is successful if none of the neighbors of node $j$ are transmitting. Furthermore, we assume that every node can transmit to at most one node at each time, receive from at most one node at each time, and cannot transmit and receive simultaneously (over the same frequency band). This especially models the packet reception in the case that the set of neighbors of node $i$, i.e., $C(i) = \{j : (i,j) \in \mathcal{L}\}$, is the set of nodes that are within the transmission range of node $i$, and the interference caused by node $i$ at all other nodes, except its neighbors, is negligible. Moreover, the packet transmission over $(i,j)$ is usually followed by an ACK transmission from receiver to sender, over $(j,i)$. Hence, for a synchronized data/ACK system, we can define a Conflict Set (CS) for link $(i,j)$ as

$$CS_{(i,j)} = \{(a,b) \in \mathcal{L} : a \in C(j), \text{ or } b \in C(i), \text{ or } a \in \{i,j\}, \text{ or } b \in \{i,j\}\}. \qquad (18)$$

This ensures that when the links in $CS_{(i,j)}$ are inactive, the data/ACK transmission over $(i,j)$ is successful.
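As an illustration of how a conflict set of the form (18) could be computed, the Python sketch below takes a set of directed links and applies the four membership conditions literally. The function name and the toy topology are assumptions of this example; note that, read literally, the conditions also place the link itself in its own conflict set.

```python
def conflict_set(link, links):
    """Conflict set of `link` per (18), for a set of directed links (a, b)."""
    i, j = link

    def neighbors(n):                      # C(n) = {b : (n, b) is a link}
        return {b for (a, b) in links if a == n}

    c_i, c_j = neighbors(i), neighbors(j)
    return {(a, b) for (a, b) in links
            if a in c_j or b in c_i or a in (i, j) or b in (i, j)}

# Hypothetical directed line 0 -> 1 -> 2 -> 3 -> 4.
links = {(0, 1), (1, 2), (2, 3), (3, 4)}
print(sorted(conflict_set((0, 1), links)))
```

In this toy line network, link (3, 4) is far enough from link (0, 1) that the two can be active in the same slot.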

Furthermore, for simplicity, assume that for each link $(i,j)$, $x_{ij} \in \{0, 1\}$, i.e., its service rate is one packet per time slot. We can capture the interference constraints by using a conflict graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$, where each vertex in $\mathcal{V}$ is a communication link in the wireless network. There is an edge $((i,j),(a,b)) \in \mathcal{E}$ between vertices $(i,j)$ and $(a,b)$ if simultaneous transmissions over communication links $(i,j)$ and $(a,b)$ are not successful. Therefore, at each time slot, the active links should form an independent set of $\mathcal{G}$, i.e., no two scheduled vertices can share an edge in $\mathcal{G}$. Let $\mathcal{R}$ be the set of all such feasible schedules and $|\mathcal{L}|$ denote the number of communication links in the wireless network.

We say that a node is active if it is a sender or a receiver for some active link. Inactive nodes can sense the wireless medium and know if there is an active node in their neighborhood. This is possible because we use a synchronized data/ACK system, and detecting active nodes can be performed by sensing the data transmission of active senders and sensing the ACK transmission of active receivers. Hence, using such carrier sensing, nodes $i$ and $j$ know if the channel is idle, i.e., $\sum_{(a,b) \in CS_{(i,j)}} x_{ab} = 0$, or if the channel is busy, i.e., $\sum_{(a,b) \in CS_{(i,j)}} x_{ab} \ge 1$.

Remark 4. For the case of single-hop networks, the link weight (3) reduces to $w_{ij}(t) = \log(1 + q_i(t))/h(q_i(t))$, where $i$ is the source and $j$ is the destination of the flow over $(i,j)$. Such a weight function is exactly the one under which throughput optimality of CSMA has been established in [10]. Next, we will propose a version of CSMA that is suitable for the general case of multihop flows and will prove its throughput optimality. The proof uses techniques originally developed in [12], [13] for continuous-time CSMA algorithms, and adapted in [10] for the discrete-time model considered here.

A. Basic CSMA Algorithm for Multihop Networks 

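For our algorithm, based on the MAC-layer information, we define a modified weight for each link $(i,j)$ as

$$\tilde{w}_{ij}(t) = \max_{d \in \mathcal{D}: R_{ij}^{(d)} = 1} \tilde{w}_{ij}^{(d)}(t), \qquad (19)$$

where

$$\tilde{w}_{ij}^{(d)}(t) = \tilde{g}(q_i^{(d)}(t)) - \tilde{g}(q_j^{(d)}(t)), \qquad (20)$$

and

$$\tilde{g}(q_i^{(d)}(t)) = \max\{g(q_i^{(d)}(t)),\ g^{*}(t)\}, \qquad (21)$$

where the function $g$ is the same as (1) defined for the centralized algorithm, and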

9*(t) ■= -^9(q max (t)), (22) 

where $q_{max}(t) := \max_{i,d} q_i^{(d)}(t)$ is the maximum MAC-layer queue length in the network at time $t$, assumed to be known, and $\epsilon$ is an arbitrarily small but fixed positive number. Note that if we remove $g^{*}(t)$ from the above definition, then $\tilde{w}_{ij}$ is equal to $w_{ij}$ in (2)-(3).

Consider the conflict graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$ of the network as defined earlier. At each time slot $t$, a link $(i,j)$ is chosen
uniformly at random, with probability $1/|\mathcal{L}|$, then

(i) If $x_{ab}(t-1) = 0$ for all links $(a,b) \in CS_{(i,j)}$, then $x_{ij}(t) = 1$ with probability $p_{ij}(t)$, and $x_{ij}(t) = 0$ with
probability $1 - p_{ij}(t)$.

Otherwise, $x_{ij}(t) = 0$.

(ii) $x_{ab}(t) = x_{ab}(t-1)$ for all $(a,b) \neq (i,j)$.

(iii) $x_{ij}^{(d)}(t) = x_{ij}(t)$ if $d = \arg\max_{d : R_{ij}^{(d)} = 1} \tilde{w}_{ij}^{(d)}(t)$ (break ties at random), and zero otherwise.

We choose $p_{ij}(t)$ to be

$p_{ij}(t) = \frac{\exp(\tilde{w}_{ij}(t))}{1 + \exp(\tilde{w}_{ij}(t))}$. (23)
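
Steps (i) and (ii) of the basic algorithm can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the conflict sets are assumed to be given as a dictionary, the weights are assumed to be precomputed, and step (iii) (the choice of destination) is omitted. All names are hypothetical.

```python
import math
import random

def csma_step(x, weights, conflicts, rng=random):
    """One slot of single-site Glauber dynamics.

    x: dict link -> 0/1, the schedule at slot t-1.
    weights: dict link -> modified weight of the link at slot t.
    conflicts: dict link -> set of links in its conflict set.
    Returns the schedule at slot t as a new dict.
    """
    link = rng.choice(sorted(x))          # uniform selection, probability 1/|L|
    new_x = dict(x)                       # (ii): all other links keep their state
    if all(x[c] == 0 for c in conflicts[link]):
        # exp(w) / (1 + exp(w)), written in the equivalent form 1 / (1 + exp(-w))
        p = 1.0 / (1.0 + math.exp(-weights[link]))
        new_x[link] = 1 if rng.random() < p else 0
    else:
        new_x[link] = 0                   # conflict set was busy in slot t-1
    return new_x
```

Starting from a feasible schedule, every schedule produced by this update is again feasible, because a link is activated only if its entire conflict set was silent in the previous slot and no other link changes state.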

It turns out that the choice of function g is crucial in establishing the throughput optimality of the algorithm for 
general networks. The following Theorem states the main result regarding the throughput optimality of the basic 
CSMA algorithm. 

Theorem 2. Consider any $\epsilon > 0$. Under the function $g$ specified for the centralized algorithm, the basic CSMA algorithm can stabilize
the network for any $\rho \in (1 - 3\epsilon)\mathcal{C}$, independent of the Transport-layer ingress queue-based congestion control (as long
as the minimum window size is one and the window sizes are bounded) and the (nonidling) service discipline used
to serve packets of active queues.

B. Distributed Implementation

The basic algorithm is based on Glauber dynamics with one site update at each time. For distributed
implementation, we need a randomized mechanism to select a link uniformly at each time slot. We use the Q-CSMA
idea [11] to perform the link selection: each time slot is divided into a control slot and a data slot. In the control
slot, nodes exchange short control messages, similar to RTS/CTS packets in the IEEE 802.11 protocol, to come up
with a collision-free decision schedule $m$. In the data slot, each link that is included in the decision schedule
performs the basic CSMA algorithm. The control message sent from node $j$ to $i$ in time slot $t$ contains the carrier
sense information of node $j$ at time $t-1$ and the vector of MAC-layer queue sizes of node $j$ at time $t$, i.e.,
$[q_j^{(d)}(t) : d \in \mathcal{D}]$, to determine the weight of link $(i,j)$.

Next, we describe the mechanisms for generation of decision schedules and data transmission schedules in more 
detail. 

Generation of decision schedule 

As in [11], we divide the control slot into two mini-slots. In the first mini-slot, each node $i$ chooses one of
its neighbors $j \in C(i)$ uniformly at random, then it transmits an RTD (Request-To-Decide) packet, containing the
ID (index) of node $j$, with probability $\beta$. If the RTD is received successfully by $j$ (i.e., $j$ and none of the neighbors of
$j$ transmit RTD messages), then in the second mini-slot $j$ sends a CTD (Clear-To-Decide) packet back to $i$, containing
the ID of node $i$. The CTD message is received successfully at $i$ if there is no collision with other CTD messages.
Given a successful RTD/CTD exchange over the link $(i,j)$, the link will be included in the decision schedule
$m$ and no link from $CS_{(i,j)}$ will be included in $m$. Hence, $m$ is a valid schedule. Each node $i$ needs to maintain
the following memories:

• $AS_i(t)$/$AR_i(t)$: node $i$ is included in $m(t)$ as a sender/receiver for some link.

• $ID_i(t)$: the index of the node that is paired with $i$ as its sender (when $AR_i(t) = 1$) or its receiver (when
$AS_i(t) = 1$).

• $NR_i(t)$/$NS_i(t)$: carrier sense by node $i$, i.e., node $i$ has an active receiver/sender in its neighborhood during
data slot $t$.

The CTD message sent back from a node $j$ to $i$ also contains the carrier sense information of node $j$, i.e., $NR_j(t-1)$
and $NS_j(t-1)$, and the vector of MAC-layer queue sizes of node $j$ at time $t$, i.e., $[q_j^{(d)}(t) : d \in \mathcal{D}]$.

Generation of data transmission schedule 

After the control slot, every node $i$ knows whether it is included in the decision schedule $m(t)$ as a sender, and also
knows its corresponding receiver $ID_i(t) = j$. The data transmission schedule at time $t$, i.e., $x(t)$, is generated based
on $x(t-1)$ and $m(t)$. Only those links that are in $m(t)$ can change their states; the states of the other links remain
unchanged. A link $(i,j)$ that is included in $m(t)$ can start a packet transmission with probability $p_{ij}(t)$ only if its
conflict set has been silent during the previous time slot, as in the basic CSMA algorithm.

Algorithm 1 Decision schedule at control slot $t$
1: For every node $i$, set $AS_i(t) = AR_i(t) = 0$.

2: In the first mini-slot:

- $AS_i(t) = 1$ with probability $\beta$; otherwise $AS_i(t) = 0$.

- If $AS_i(t) = 1$, choose a node $j \in C(i)$ uniformly at random, send an RTD to $j$, and set $ID_i(t) = j$;
otherwise listen for RTD messages.
3: In the second mini-slot:

- If an RTD was received from $j$ in the first mini-slot, send a CTD to $j$ and set $AR_i(t) = 1$ and $ID_i(t) = j$; nodes
with $AS_i(t) = 1$ listen for CTD messages.

- If $AS_i(t) = 1$ and a CTD is received successfully from $ID_i(t)$, include $(i, ID_i(t))$ in $m(t)$; otherwise set $AS_i(t) = 0$.

Algorithm 2 Data transmission schedule at slot $t$
1: - For all $i$ with $AS_i(t) = 1$ and receiver $j = ID_i(t)$:

If no links in $CS_{(i,j)}$ were active in the previous data slot, i.e., $x_{ij}(t-1) = 1$ or $NR_i(t-1) = NS_j(t-1) = 0$,

• $x_{ij}(t) = 1$ with probability $p_{ij}(t)$,

• $x_{ij}(t) = 0$ with probability $\bar{p}_{ij}(t) = 1 - p_{ij}(t)$.
Else $x_{ij}(t) = 0$.

- For all $(i,j) \notin m(t)$: $x_{ij}(t) = x_{ij}(t-1)$.

2: In the data slot, use $x(t)$ as the transmission schedule.
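
The data-transmission update of Algorithm 2 can be sketched in Python as follows. This is a simplified illustration: it assumes that each link can read the previous state of its whole conflict set from a dictionary, whereas in the protocol this information is obtained locally through the carrier-sense bits $NR$ and $NS$. The decision schedule is assumed to be collision-free, and all names are hypothetical.

```python
import random

def data_schedule(x_prev, decision, conflicts, p, rng=random):
    """Multiple-site update for one data slot.

    x_prev: dict link -> 0/1, the schedule x(t-1).
    decision: set of links in the decision schedule m(t) (no two conflict).
    conflicts: dict link -> set of links in its conflict set.
    p: dict link -> activation probability p_ij(t).
    Returns the schedule x(t) as a new dict.
    """
    x = dict(x_prev)                      # links outside m(t) keep their state
    for link in decision:
        if all(x_prev[c] == 0 for c in conflicts[link]):
            x[link] = 1 if rng.random() < p[link] else 0
        else:
            x[link] = 0                   # conflict set was busy in slot t-1
    return x
```

Because the links in the decision schedule do not conflict with one another, updating them in parallel cannot create a collision as long as the previous schedule was feasible.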



Data transmission and carrier sensing 

In the data slot, we use $x(t)$ for the data transmission. In this slot, every node $i$ will perform one of the following.
$x_{ij}(t) = 1$: Node $i$ will send a data packet to node $j$.

$x_{ji}(t) = 1$: Node $i$ will send an ACK to node $j$ after receiving a data packet from $j$.

All other nodes are inactive and perform carrier sensing. Since the data/ACK transmissions are synchronized
in our system, every inactive node $i$ will set $NS_i(t) = 0$ if it does not sense any transmission during the data
transmission period and set $NS_i(t) = 1$ otherwise. Similarly, node $i$ will set $NR_i(t) = 0$ if it senses no signal
during the ACK transmission period and set $NR_i(t) = 1$ otherwise.

Remark 5. In IEEE 802.11 DCF, the RTS/CTS exchange is used to reduce the hidden terminal problem. However,
even with RTS/CTS, the hidden terminal problem can still occur; see [11]. Since, in our synchronized system, RTD
and CTD messages are sent in two different mini-slots, this completely eliminates the hidden terminal problem.

Remark 6. To determine the weight at each link, $q_{\max}(t)$ is also needed. Instead, each node can maintain an
estimate of $q_{\max}(t)$, similar to the procedure suggested in [12]. In fact, it is easy to incorporate such a procedure
in our algorithm because, in the control slot, each node can include its estimate of $q_{\max}(t)$ in the control messages
and update its estimate based on the received control messages. Then we can use Lemma 2 of [12] to complete
the stability proof, so we do not pursue this issue here. In practical networks, $\frac{\epsilon}{4N^3} \log(1 + q_{\max}(t))$ is small and
we can use the weight function $g$ directly; thus, there may not be any need to know $q_{\max}(t)$.

Corollary 1. Under the weight function $g$ specified in Theorem 2, the distributed algorithm can stabilize the network
for any $\rho \in (1 - 3\epsilon)\mathcal{C}$.

C. Proof of Throughput Optimality 

Consider the basic CSMA algorithm over a graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$. Assume that the weights are constants, i.e., the basic
algorithm uses a weight vector $w = [w_{ij} : (i,j) \in \mathcal{L}]$ at all times. Then, the basic algorithm is essentially an
irreducible, aperiodic, and reversible Markov chain (called Glauber dynamics) to generate the independent sets of
$\mathcal{G}$. So, the state space $\mathcal{R}$ consists of all independent sets of $\mathcal{G}$. The stationary distribution of the chain is given by

$\pi(s) = \frac{1}{Z} \exp\Big( \sum_{(i,j) \in s} w_{ij} \Big); \quad s \in \mathcal{R}$, (24)

where $Z$ is the normalizing constant.
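
For a small conflict graph, the product-form distribution (24) can be evaluated by brute-force enumeration of the independent sets. The Python sketch below does this; it is meant only as an illustration for toy examples (the enumeration is exponential in the number of links), and the function names are hypothetical.

```python
import math
from itertools import combinations

def independent_sets(links, conflicts):
    """All subsets of links in which no two links conflict."""
    sets = []
    for r in range(len(links) + 1):
        for s in combinations(links, r):
            if all(b not in conflicts[a] for a, b in combinations(s, 2)):
                sets.append(frozenset(s))
    return sets

def stationary_distribution(weights, conflicts):
    """pi(s) = exp(sum of weights of links in s) / Z over all independent sets s."""
    links = sorted(weights)
    scheds = independent_sets(links, conflicts)
    unnorm = {s: math.exp(sum(weights[l] for l in s)) for s in scheds}
    z = sum(unnorm.values())              # normalizing constant Z
    return {s: v / z for s, v in unnorm.items()}
```

For two mutually conflicting links with weights $\log 2$ and $0$, the feasible schedules are the empty schedule and the two single-link schedules, with unnormalized masses 1, 2 and 1, so the heavier link is scheduled with probability 1/2.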

We start with the following lemma that relates the modified link weight and the original link weight.

Lemma 4. For all links $(i,j) \in \mathcal{L}$, the modified link weight (19) and the original link weight differ by at most $g^*(t)$, i.e.,

$|\tilde{w}_{ij}(t) - w_{ij}(t)| \le g^*(t)$. (25)

The proof is included in the appendix. The basic algorithm uses a time-varying version of the Glauber dynamics,
where the weights change with time. This yields a time-inhomogeneous Markov chain, but we will see that, for the
choice of function $g$ made here, it behaves similarly to the Glauber dynamics.

Mixing time of Glauber dynamics 

For simplicity, we index the elements of $\mathcal{R}$ by $1, 2, \ldots, r$, where $r = |\mathcal{R}|$. Then, the eigenvalues of the
corresponding transition matrix are ordered in such a way that

$\lambda_1 = 1 > \lambda_2 \ge \cdots \ge \lambda_r > -1$.

The convergence to the steady-state distribution is geometric with a rate equal to the second largest eigenvalue modulus
(SLEM) of the transition matrix [14]. In fact, for any initial probability distribution $\mu_0$ on $\mathcal{R}$, and for all $n \ge 1$,

$\|\mu_0 P^n - \pi\|_{1/\pi} \le (\lambda^*)^n \|\mu_0 - \pi\|_{1/\pi}$, (26)

where $\lambda^* = \max\{\lambda_2, |\lambda_r|\}$ is the SLEM. Note that, by definition, $\|z\|_{1/\pi} = \big( \sum_{i=1}^{r} z(i)^2 \frac{1}{\pi(i)} \big)^{1/2}$.
The following lemma gives an upper bound on the SLEM $\lambda^*$ of Glauber dynamics.

Lemma 5. For the Glauber dynamics with the weight vector $w$ on a graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$,

$\lambda^* \le 1 - \frac{1}{16^{|\mathcal{L}|} \exp(4 |\mathcal{L}| w_{\max})}$,

where $w_{\max} = \max_{(i,j) \in \mathcal{L}} w_{ij}$.

See [10] for the proof. We define the mixing time as $T = \frac{1}{1 - \lambda^*}$, so

$T \le 16^{|\mathcal{L}|} \exp(4 |\mathcal{L}| w_{\max})$. (27)

A simple calculation based on (26) reveals that the amount of time needed to get close to the stationary distribution
is approximately proportional to $T$.

A key proposition 

At any time slot $t$, given the weight vector $\tilde{w}(t) = [\tilde{w}_{ij}(t) : (i,j) \in \mathcal{L}]$, the MaxWeight-type algorithm of the
centralized scheme should solve $\max_{s \in \mathcal{R}} \sum_{(i,j) \in s} \tilde{w}_{ij}(t)$; instead, the distributed algorithm tries to simulate a distribution

$\pi_t(s) = \frac{1}{Z_t} \exp\Big( \sum_{(i,j) \in s} \tilde{w}_{ij}(t) \Big); \quad s \in \mathcal{R}$, (28)

i.e., the stationary distribution of Glauber dynamics with the weight vector $\tilde{w}(t)$ at time $t$.

Let $P_t$ denote the transition probability matrix of Glauber dynamics with the weight vector $\tilde{w}(t)$. Also let $\mu_t$ be
the true probability distribution of the time-inhomogeneous chain, over the set of schedules $\mathcal{R}$, at time $t$. Therefore,
we have $\mu_t = \mu_{t-1} P_t$. Let $\pi_t$ denote the stationary distribution of the time-homogeneous Markov chain with $P = P_t$,
as in (28). By choosing proper $g^*$ and $g(\cdot)$, we aim to ensure that $\mu_t$ and $\pi_t$ are close enough, i.e., $\|\pi_t - \mu_t\|_{TV} \le \delta$
for some $\delta$ arbitrarily small, where $\|\pi - \mu\|_{TV} = \frac{1}{2} \sum_{i=1}^{r} |\pi(i) - \mu(i)|$. Note that $\|\mu - \pi\|_{1/\pi} \ge 2 \|\mu - \pi\|_{TV}$. Next,
we characterize the amount of change in the stationary distribution as a result of queue evolutions.
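
The relation between the two distances used above can be checked numerically. The Python sketch below implements the total variation distance and the $1/\pi$-weighted norm for distributions stored as dictionaries; the inequality $\|\mu - \pi\|_{1/\pi} \ge 2\|\mu - \pi\|_{TV}$ follows from the Cauchy-Schwarz inequality. The function names are hypothetical.

```python
import math

def tv_distance(mu, pi):
    """Total variation distance: half the L1 distance between mu and pi."""
    return 0.5 * sum(abs(mu[s] - pi[s]) for s in pi)

def inv_pi_norm(z, pi):
    """Weighted norm ||z||_{1/pi} = sqrt(sum_s z(s)^2 / pi(s))."""
    return math.sqrt(sum(z[s] ** 2 / pi[s] for s in pi))

def distance_gap(mu, pi):
    """Returns ||mu - pi||_{1/pi} - 2 ||mu - pi||_TV, which is nonnegative."""
    diff = {s: mu[s] - pi[s] for s in pi}
    return inv_pi_norm(diff, pi) - 2.0 * tv_distance(mu, pi)
```

Bounding the weighted norm therefore also bounds the total variation distance, which is the direction in which the inequality is used in the sequel.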

Lemma 6. For any schedule $s \in \mathcal{R}$, $e^{-\alpha_t} \le \frac{\pi_{t+1}(s)}{\pi_t(s)} \le e^{\alpha_t}$, where

$\alpha_t = 2(1 + W_{cong}) |\mathcal{L}|\, g'\big( g^{-1}(g^*(t+1)) - 1 - W_{cong} \big)$, (29)

and $W_{cong}$ is the maximum congestion window size.

Now, equipped with Lemmas 5 and 6, we make use of the results in [12], [13] and [10] in the final proof.
Specifically, we will use the following key proposition from [10], for which we have included a proof in the
appendix for completeness.

Proposition 1. Given any $\delta > 0$, $\|\pi_t - \mu_t\|_{TV} \le \delta/4$ holds when $q_{\max}(t) \ge q_{th} + t^*$, if there exists a $q_{th}$ such
that

$\alpha_t T_{t+1} \le \delta/16$ whenever $q_{\max}(t) > q_{th}$, (30)

where

(i) $T_t \le 16^{|\mathcal{L}|} \exp(4 |\mathcal{L}| \tilde{w}_{\max}(t))$,

(ii) $t^*$ is the smallest $t$ such that

$\frac{1}{\sqrt{\min_s \pi_{t_1}(s)}} \exp\Big( -\sum_{k=t_1}^{t_1+t} \frac{1}{T_k^2} \Big) \le \delta/4$, (31)

where $q_{\max}(t_1) = q_{th}$.

In other words, Proposition 1 states that when queue lengths are large, the observed distribution of the schedules
is close to the desired stationary distribution. The key idea in the proof is that the weights change at the rate $\alpha_t$
while the system responds to these changes at the rate $1/T_{t+1}$. Condition (30) ensures that the weight dynamics
are slow enough compared to the response time of the chain that it remains close to its equilibrium (stationary
distribution).

We will also use the following lemma, which relates the maximum queue length and the maximum weight in the
network. Hence, when one grows, the other one increases as well.

Lemma 7. Let $w_{\max}(t) = \max_{(i,j) \in \mathcal{L}} w_{ij}(t)$. Then

$\frac{1}{N}\, g(q_{\max}(t)) \le w_{\max}(t) \le g(q_{\max}(t))$.

Some useful results for the basic CSMA algorithm 

Roughly speaking, since the mixing time $T$ is exponential in $g(q_{\max})$, $g'(g^{-1}(g^*))$ must be of the form
$e^{-g^*}$; otherwise it will be impossible to satisfy $\alpha_t T_{t+1} \le \delta/16$ in Proposition 1 for any arbitrarily small $\delta$ as
$q_{\max}(t) \to \infty$. The only function with such a property is the $\log(\cdot)$ function. In fact, $g$ must grow slightly slower
than $\log(\cdot)$ to satisfy (30), and to ensure the existence of a finite $t^*$ in Proposition 1. For example, by choosing functions
that grow much slower than $\log(1 + x)$, like $h(x) = \log(e + \log(1 + x))$, we can make $g(x)$ behave approximately
like $\log(1 + x)$ for large ranges of $x$ (correspondingly, for the range of practical queue lengths). More accurately,
we state the result as the following lemma, whose proof can be found in the appendix.

Lemma 8. The basic CSMA algorithm, with the function $g$ defined above, satisfies the requirements of Proposition 1.

Next, the following lemma states that, with high probability, the basic CSMA algorithm chooses schedules
whose weights are close to that of the MaxWeight schedule.

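Lemma 9. The basic CSMA algorithm has the following property: Given any $0 < \epsilon < 1$ and $0 < \delta < 1$, there
exists a $B(\delta, \epsilon) > 0$ such that whenever $q_{\max}(t) > B(\delta, \epsilon)$, with probability larger than $1 - \delta$, it chooses a schedule
$s(t) \in \mathcal{R}$ that satisfies

$\sum_{(i,j) \in s(t)} w_{ij}(t) \ge (1 - \epsilon) \max_{s \in \mathcal{R}} \sum_{(i,j) \in s} w_{ij}(t)$.

Proof: Let $w^*(t) = \max_{s \in \mathcal{R}} \sum_{(i,j) \in s} w_{ij}(t)$ and define

$\chi_t := \Big\{ s \in \mathcal{R} : \sum_{(i,j) \in s} w_{ij}(t) < (1 - \epsilon) w^*(t) \Big\}$.

Therefore, we need to show that $\mu_t(\chi_t) \le \delta$ for $q_{\max}(t)$ large enough. For our choice of $g(\cdot)$ and $g^*$, it follows from
Proposition 1 that, whenever $q_{\max}(t) > q_{th} + t^*$, $2\|\mu_t - \pi_t\|_{TV} \le \delta/2$, and consequently, $\sum_{s \in \mathcal{R}} |\mu_t(s) - \pi_t(s)| \le \delta/2$. Thus,

$\sum_{s \in \chi_t} \mu_t(s) \le \sum_{s \in \chi_t} \pi_t(s) + \delta/2$.

Therefore, to ensure that $\sum_{s \in \chi_t} \mu_t(s) \le \delta$, it suffices to have $\sum_{s \in \chi_t} \pi_t(s) \le \delta/2$. But, by Lemma 4, $\tilde{w}_{ij}(t) \le w_{ij}(t) + g^*(t)$, so

$\sum_{s \in \chi_t} \pi_t(s) \le \sum_{s \in \chi_t} \frac{1}{Z_t} e^{\sum_{(i,j) \in s} w_{ij}(t)} e^{|\mathcal{L}| g^*(t)} \le \frac{1}{Z_t} |\chi_t|\, e^{(1 - \epsilon) w^*(t)} e^{|\mathcal{L}| g^*(t)}$,

and

$Z_t = \sum_{s \in \mathcal{R}} e^{\sum_{(i,j) \in s} \tilde{w}_{ij}(t)} > e^{w^*(t) - |\mathcal{L}| g^*(t)}$.

Therefore,

$\sum_{s \in \chi_t} \pi_t(s) \le 2^{|\mathcal{L}|} e^{2 |\mathcal{L}| g^*(t) - \epsilon w^*(t)}$

when $q_{\max}(t) > q_{th} + t^*$. Note that $w^*(t) \ge w_{\max}(t) \ge g(q_{\max}(t))/N$, and $g^*(t) = \frac{\epsilon}{4N^3} g(q_{\max}(t))$, so

$\sum_{s \in \chi_t} \pi_t(s) \le 2^{N^2} e^{-\frac{\epsilon}{2N} g(q_{\max}(t))} \le \delta/2$

whenever $q_{\max}(t) > B(\delta, \epsilon)$, where

$B(\delta, \epsilon) = \max\Big\{ q_{th} + t^*,\ g^{-1}\Big( \frac{2N}{\epsilon} \Big( N^2 \log 2 + \log \frac{2}{\delta} \Big) \Big) \Big\}$.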

Throughput optimality 

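Now we are ready to prove the throughput optimality of the basic CSMA algorithm. Let $x^*$ and $\tilde{x}^*$ be the optimal
schedules based on total queues and MAC queues, respectively, as defined for the centralized algorithm, and let $x$ be the schedule generated
by the basic CSMA algorithm. The proof parallels the argument for the throughput optimality of the centralized
algorithm. In particular, the following inequality still holds:

$\mathbb{E}_{S(t)}[\Delta V(t)] \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(Q_n^{(d)}(t)) \rho_n^{(d)} - \mathbb{E}_{S(t)}\Big[ \sum_{(i,j) \in \mathcal{L}} x_{ij}(t) W_{ij}(t) \Big] + C_1 + C_2$. (32)

Next, observe that

$\sum_{(i,j) \in \mathcal{L}} x^*_{ij} W_{ij}(t) - \mathbb{E}_{S(t)}\Big[ \sum_{(i,j) \in \mathcal{L}} x_{ij} W_{ij}(t) \Big]$
$= \mathbb{E}_{S(t)}\Big[ \sum_{(i,j) \in \mathcal{L}} x^*_{ij} W_{ij}(t) - \sum_{(i,j) \in \mathcal{L}} x^*_{ij} w_{ij}(t) \Big]$ (33)
$+ \mathbb{E}_{S(t)}\Big[ \sum_{(i,j) \in \mathcal{L}} x^*_{ij} w_{ij}(t) - \sum_{(i,j) \in \mathcal{L}} x_{ij} w_{ij}(t) \Big]$ (34)
$+ \mathbb{E}_{S(t)}\Big[ \sum_{(i,j) \in \mathcal{L}} x_{ij} w_{ij}(t) - \sum_{(i,j) \in \mathcal{L}} x_{ij} W_{ij}(t) \Big]$. (35)

Now note that each of the terms (33) and (35) is less than $|\mathcal{L}| \log(1 + 1/\eta_{\min}) / h(0)$ by Lemma 3. The term (34)
is bounded from above, by using Lemma 9, as follows:

$(34) \le \sum_{(i,j) \in \mathcal{L}} x^*_{ij} w_{ij}(t) - (1 - \delta)(1 - \epsilon) \sum_{(i,j) \in \mathcal{L}} \tilde{x}^*_{ij} w_{ij}(t)$

$\le \sum_{(i,j) \in \mathcal{L}} x^*_{ij} w_{ij}(t) - (1 - \delta)(1 - \epsilon) \sum_{(i,j) \in \mathcal{L}} x^*_{ij} w_{ij}(t)$

$\le (1 - (1 - \delta)(1 - \epsilon)) \sum_{(i,j) \in \mathcal{L}} x^*_{ij} W_{ij}(t) + |\mathcal{L}| \log(1 + 1/\eta_{\min}) / h(0)$,

whenever $q_{\max}(t) > B(\delta, \epsilon)$, for any $\delta > 0$. Thus, using the above bounds for terms (33), (34) and (35), we get

$\mathbb{E}_{S(t)}\Big[ \sum_{(i,j) \in \mathcal{L}} x_{ij} W_{ij}(t) \Big] \ge (1 - \delta)(1 - \epsilon) \sum_{(i,j) \in \mathcal{L}} x^*_{ij} W_{ij}(t) - 3 |\mathcal{L}| \log(1 + 1/\eta_{\min}) / h(0)$. (36)

Using (36) in (32) yields

$\mathbb{E}_{S(t)}[\Delta V(t)] \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(Q_n^{(d)}(t)) \rho_n^{(d)} - (1 - \delta)(1 - \epsilon) \sum_{(i,j) \in \mathcal{L}} x^*_{ij} W_{ij}(t) + C_3$, (37)

where $C_3 := C_1 + C_2 + 3 |\mathcal{L}| \log(1 + 1/\eta_{\min}) / h(0)$. Using (8) and rewriting the right-hand side of (37) by changing
the order of summations yields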

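$\mathbb{E}_{S(t)}[\Delta V(t)] \le \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(Q_n^{(d)}(t)) \Big[ \rho_n^{(d)} + (1 - \delta)(1 - \epsilon) \Big( \sum_{i=1}^{N} x^{*(d)}_{in}(t) - \sum_{j=1}^{N} x^{*(d)}_{nj}(t) \Big) \Big] + C_3$

whenever $q_{\max}(t) > B(\delta, \epsilon)$. The rest of the proof is standard. For any load $\rho$ strictly inside $(1 - 3\epsilon)\mathcal{C}$, there must
exist a $\mu \in Co(\mathcal{R})$ such that for all $1 \le n \le N$ and all $d \in \mathcal{D}$,

$\rho_n^{(d)} \le (1 - 3\epsilon) \Big( \sum_{j=1}^{N} \mu_{nj}^{(d)} - \sum_{i=1}^{N} \mu_{in}^{(d)} \Big)$. (38)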



Let p* = (1 - 3e) min„ eA r !de -D ( £\ R n - P nj - R L Vin ) for some positive p*. Hence, 



E 



s(t) 



avw] < (i-^)(i-e)EE{^Ho)[E^^% ( nW-E< ) ^S ) ( < )]} 

Ti^ideri i=i j=i 

AT AT AT 

+ (i - 3e ) E E [E «SMJ - E } + c- 

n=ldeT> j=l i=l 

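For any fixed small $\epsilon > 0$, we can choose $\delta < \epsilon/(1 - \epsilon)$ to ensure $(1 - \delta)(1 - \epsilon) > 1 - 2\epsilon$. Moreover, from the definition
of $x^*(t)$ and the convexity of $Co(\mathcal{R})$, it follows that

$\sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(Q_n^{(d)}(t)) \Big[ \sum_{j=1}^{N} x^{*(d)}_{nj}(t) - \sum_{i=1}^{N} x^{*(d)}_{in}(t) \Big] \ge \sum_{n=1}^{N} \sum_{d \in \mathcal{D}} g(Q_n^{(d)}(t)) \Big[ \sum_{j=1}^{N} \mu_{nj}^{(d)} - \sum_{i=1}^{N} \mu_{in}^{(d)} \Big]$

for any $\mu \in Co(\mathcal{R})$. Hence,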



(<*) (<*) 
in r^m 



, (39) 



E 



S(t) 



AV(t) 



< -e 



N u\ N 

( V - Vi? (t V d) 



:EE^ d) w)[E«: 

n=lde2? j = l i=l 

Ar 



c 3 



n=l deD 
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whenever maxn^Qn > 9 1 ( c '^t e 1 e 3e ) and q m ax{t) > B(8,e) or, as a sufficient condition, whenever 

g m ax(t) > max|B((5,e),5 _1 (^_i-iL_^!)| . 

In particular, to get negative drift, — e', for some positive constant e', it suffices that 

max N n > max { g _1 (^_Lf —\ B(8, e) 

n L P* e 

because $q_{\max}(t) \ge \max_n N_n$, and $g$ is an increasing function. This concludes the proof of the main theorem.

Extension of the proof to the distributed implementation 

The distributed algorithm is based on multiple site-update (or parallel operating) Glauber dynamics as defined
next. Consider the graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$ as before and a constant weight vector $w = [w_{ij} : (i,j) \in \mathcal{L}]$. At each time $t$, a
decision schedule $m(t)$ from $\mathcal{R}$ is selected at random with positive probability $\alpha(m(t))$. Then, for all $(i,j) \in m(t)$,
we perform the regular Glauber dynamics. It can be shown that the resulting Markov chain is reversible, that it has the
same stationary distribution as the regular Glauber dynamics in (28), and that its mixing time is almost the same as
(27). In fact, the mixing time of the chain is characterized by the following lemma.

Lemma 10. For the multiple site-update Glauber dynamics with the weight vector $w$ on a graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$,

$T \le \frac{64^{|\mathcal{V}|}}{2} \exp(4 |\mathcal{V}| w_{\max})$, (40)

where $w_{\max} = \max_{i \in \mathcal{V}} w_i$.

See [10] for the proof. The rest of the analysis is the same as the argument for the basic algorithm. The distributed
algorithm uses a time-varying version of the multiple site-update Glauber dynamics, where the weights change with
time. Although the upper bound of Lemma 10 is loose, it is sufficient to prove the optimality of the algorithm.

Finally, let $T_{data}$ and $T_{control}$ denote the lengths of the data slot and the control slot. Thus, the distributed
algorithm can achieve a fraction $T_{data} / (T_{data} + T_{control})$ of the capacity region. In particular, in our algorithm,
it suffices to allocate two short mini-slots at the beginning of the slot for the purpose of control. By choosing the
data slot to be much larger than the control slot, the algorithm can approach the full capacity.

VI. Conclusions 

The design of efficient scheduling and congestion control algorithms can be decoupled by using MAC-layer queues
for the scheduling of packets and using window-based congestion control mechanisms for controlling the rate at
which packets are injected into the network. This separation result is very appealing to the network designer. It is
also important from the practical perspective because, typically, only the MAC-layer information is available to the 
scheduler since it is implemented as part of the MAC layer. Moreover, window-based congestion control is also 
more consistent with practical implementations like TCP. 
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Appendix A 
Proof of LemmaQ] 

Let A^j(t) denote the number of packets of file / injected into the MAC layer of node n, and D^j{t) = 
a n f(t)I n f(t) denote the expected "packet departure" of file / from the transport layer. Let B n j(t) — A^j(t) — 
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D<${t) for file /. 

Part (i): It suffices to show that for each individual file 1 < / < N n (t), Es(t) B n f(t) = 0. We only need to 
focus on files / with £ n f(t) = 1, i.e., existing files in the Transport layer, or new files, i.e, / 6 (N n (t) + 1, N n (t) + 
ftn(£))> because the E^^) B n f(t) = if file / has no packets in the Transport layer. 

Let yV^f W b e me remaining window size of file / at node n after MAC-layer departure but before the MAC-layer 
injection. We want to show that, for any w > 0, 



L s(t) 



B nf (t)W r nf (t)=w 



0. 



(41) 



then (EJ implies E S (t) B nf {t) = 0. 

Because the number of remaining packets at the Transport layer at each time is geometrically distributed with 
mean size a n f(t), the transport layer will continue to inject packets into the MAC layer with probability 7 n /(i) = 
1 — \/a n f{t) = 1 — Tj n f{£) as long as all previous packets are successfully injected and the window size is not full. 

Clearly, if w — 0, no packet can be injected into the MAC layer. Therefore, A^j(t) = and D^j(t) = 0, and 
(PiTt is satisfied. Next, we consider the case when w > 0. Let p w {k, j) denote the probability that A^J(t) = k and 
I n f(t) = j £ {0, 1} given that W^t(t) = w. Because transport-layer packets are injected into the MAC layer as 
long as the window is not full, we have p w (k, 0) = for k < w. Obviously, p w (k, 1) = for k > w. 

3< 



The probability that A^l (t) = k where k < w directly follows the geometric distribution of the remaining 



packets of file /, i.e., 

Pw (k, i) = p (iS(t) = k,i nf (t) = i\w; lf (t) = w 

= F(Ai d }(t) = k\W r nf (t) = w) 

= l k nfm-1nf{t)), 

for 1 < k < w. Note that from the definition of / n /(t), we have 

w 

P (I nf (t) = 0\W r nf (t) =w)=l- J2P™( k > 1) = 7£(<)- 
Now we calculate the left-hand side of (Hll . 



k=l 



E. 



SU) 



B nf (t)\W r nf (t)=w 



J2p w (k,l)(k-a nf ) +¥(l nf (t) = 0\W^ f (t) = w) 

k=l 

w 

d 



k=l 



= (l-7n/) 



0. 



dlnf 



i r - >y w+1 
' n f Inf 



C 1 - 7n/)°""/ + w lnf 
1 — 'v w . 

Inf 



Inf 



1 



Inf 



Part (ii): From the definition of B n (t), we have 

JV„(i) JV„(t)+o„(t) 

B B (*) - £ S n/ (i)+ E 

/=i /=JV„(t)+l 
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Using the fact that new arriving files are mutually independent, and are also independent of current network state, 
we have 

N n (t) „ A r „(t)+a„(t) 



3 



5(t) 



B n (ty 



/=1 



E B */W 5 

/=JV„(t)+l 



(42) 



where we have also used the fact that E s(t) [s n /(*)] = 0. Note that B nf (t) 2 < max{i|^(i) 2 , /^(i) 2 }. So, 
based on the assumption that the congestion window size is bounded by W cong and the mean file size is bounded 
by l/ri m i n , we have E 5 ( t ) B n f(t) 2 \ < max{VV 2 , l/rf nin }- Therefore, the second term in d42l is bounded by 



N n (t)+a n (t) 

E S(t) [ E B «/ W 2 ] < K » maX {Wc « S , 1/lL}' 
f=N n (t) + l 



(43) 



Next, we bound the first term in (1421 1. Let .F n (t) denote the set of files at node n that are served at time t. Because 
B n f(t) = if the existing file is not served, we have 



N n {t) 



£ fl n/ (t)| < max{ ]T %3(t), E 

f£Fn(t) feFn(t) 

< (^(t)! ■ max|w co „ g , l/77 min |. 



Note that \F n {t)\ < Y^j-( n j)e£ x nj(t) — Nr max because the number of existing files that are served cannot 
exceed the sum of outgoing link capacities. Thus, 



ni(t) 



[(E B </W 



/=1 



< N 2 r 2 „^, max 



|W 2 l/?7 2 \ 



(44) 



Substituting (|43T > and (l44t into (l42l completes the proof. 



Note that = if q)?\t) > Nr max , and u£>{t) < Nr max if gW(i) < Nr. 



Appendix B 
Proof of Lemma 2



In the latter case, 



since the congestion window size for every file is at least one, we know that there are at most Nr max files in 
transport layer of node n intended for destination d. Hence, based on the definition of Q„ (t), Qn (t) > Q a := 
Nr max ~\~ Nv raax I Tj m i n . So, 



E 



S(t) [g (Qi d Ht))ui d \t)\ = E s(f) [g(Q^(t))u^(t)l {$\t) < Nr max ) 

< Es(t) [g(Q { nHt))Nr max t | g W)(t) < Nr max } 

< Nr max g{Q ) 

Therefore, the result follows by choosing C2 = N 3 r max g(Nr max (l + l/r) m in))- 
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Appendix C 
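Proof of Lemma 4

It is sufficient to prove that for all $d \in \mathcal{D}$, $w_{ij}^{(d)}(t) - g^*(t) \le \tilde{w}_{ij}^{(d)}(t) \le w_{ij}^{(d)}(t) + g^*(t)$, as we do now. First,

$\tilde{w}_{ij}^{(d)}(t) \le \max\{ g(q_i^{(d)}(t)),\ g^*(t) \} - g(q_j^{(d)}(t))$

$\le g(q_i^{(d)}(t)) + g^*(t) - g(q_j^{(d)}(t))$

$= w_{ij}^{(d)}(t) + g^*(t)$.

Similarly,

$\tilde{w}_{ij}^{(d)}(t) \ge g(q_i^{(d)}(t)) - \max\{ g(q_j^{(d)}(t)),\ g^*(t) \}$

$\ge g(q_i^{(d)}(t)) - g^*(t) - g(q_j^{(d)}(t))$

$= w_{ij}^{(d)}(t) - g^*(t)$.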



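Appendix D
Proof of Lemma 6

Note that

$\frac{\pi_{t+1}(s)}{\pi_t(s)} = \frac{Z_t}{Z_{t+1}} \exp\Big( \sum_{(i,j) \in s} \big( \tilde{w}_{ij}(t+1) - \tilde{w}_{ij}(t) \big) \Big)$,

where

$\frac{Z_t}{Z_{t+1}} \le \max_{s \in \mathcal{R}} \exp\Big( \sum_{(i,j) \in s} \big( \tilde{w}_{ij}(t) - \tilde{w}_{ij}(t+1) \big) \Big) \le \exp\Big( \sum_{(i,j) \in \mathcal{L}} | \tilde{w}_{ij}(t+1) - \tilde{w}_{ij}(t) | \Big)$.

Let $q^*(t)$ denote $g^{-1}(g^*(t))$, and define $\tilde{q}_i^{(d)}(t) := \max\{ q^*(t),\ q_i^{(d)}(t) \}$. Hence,

$\tilde{w}_{ij}^{(d)}(t+1) - \tilde{w}_{ij}^{(d)}(t) = g(\tilde{q}_i^{(d)}(t+1)) - g(\tilde{q}_j^{(d)}(t+1)) - g(\tilde{q}_i^{(d)}(t)) + g(\tilde{q}_j^{(d)}(t))$

$= \big[ g(\tilde{q}_i^{(d)}(t+1)) - g(\tilde{q}_i^{(d)}(t)) \big] + \big[ g(\tilde{q}_j^{(d)}(t)) - g(\tilde{q}_j^{(d)}(t+1)) \big]$

$\le g'(\tilde{q}_i^{(d)}(t)) \big( \tilde{q}_i^{(d)}(t+1) - \tilde{q}_i^{(d)}(t) \big) + g'(\tilde{q}_j^{(d)}(t+1)) \big( \tilde{q}_j^{(d)}(t) - \tilde{q}_j^{(d)}(t+1) \big)$,

where the last inequality follows from the fact that $g$ is a concave and increasing function. If we assume that the link
service rate is at most one and the congestion window sizes are at most $W_{cong}$, then for all $i \in \mathcal{N}$ and for all
$d \in \mathcal{D}$, $|\tilde{q}_i^{(d)}(t+1) - \tilde{q}_i^{(d)}(t)| \le 1 + W_{cong}$. Hence,

$\frac{| \tilde{w}_{ij}^{(d)}(t+1) - \tilde{w}_{ij}^{(d)}(t) |}{1 + W_{cong}} \le g'(\tilde{q}_i^{(d)}(t)) + g'(\tilde{q}_j^{(d)}(t+1)) \le 2 g'\big( q^*(t+1) - 1 - W_{cong} \big)$,

and thus,

$\frac{\pi_{t+1}(s)}{\pi_t(s)} \le e^{2(1 + W_{cong}) |\mathcal{L}|\, g'(q^*(t+1) - 1 - W_{cong})}$.

Similarly,

$\frac{\pi_{t+1}(s)}{\pi_t(s)} \ge e^{-2(1 + W_{cong}) |\mathcal{L}|\, g'(q^*(t+1) - 1 - W_{cong})}$.

This concludes the proof.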

Appendix E
Proof of Lemma 7

The second inequality immediately follows from the definition of $w_{ij}$. To prove the first inequality, consider a
destination $d$, with routing matrix $R^{(d)} \in \{0, 1\}^{N \times N}$, and let $w^{(d)} = [w_{ij}^{(d)}(t) : R_{ij}^{(d)} = 1]$; then, based on the definition of the link weights, we
have

$w^{(d)} = (I - R^{(d)})\, g(q^{(d)})$,

where $g(q^{(d)}) = [g(q_i^{(d)}) : i \in \mathcal{N}]$. Note that every row of $R^{(d)}$ has exactly one "1" entry except the row
corresponding to $d$, which is all zero, so $(R^{(d)})^N = 0$. Therefore, $(I - R^{(d)})^{-1} = I + R^{(d)} + (R^{(d)})^2 + \cdots$ exists
and $I - R^{(d)}$ is nonsingular (similar to the argument on page 222 of ET\ ). So $g(q^{(d)}) = (I - R^{(d)})^{-1} w^{(d)}$. Let
$\|\cdot\|_\infty$ denote the $\infty$-norm. Then we have

$\| (I - R^{(d)})^{-1} \|_\infty = \Big\| \sum_{k=0}^{N-1} (R^{(d)})^k \Big\|_\infty \le \sum_{k=0}^{N-1} \| (R^{(d)})^k \|_\infty \le \sum_{k=0}^{N-1} \| R^{(d)} \|_\infty^k \le N$,

where we have used the basic properties of the matrix norm, and the fact that $\| R^{(d)} \|_\infty = 1$. Therefore,

$\| g(q^{(d)}) \|_\infty \le \| (I - R^{(d)})^{-1} \|_\infty \| w^{(d)} \|_\infty \le N \| w^{(d)} \|_\infty$

for every $d \in \mathcal{D}$. Taking the maximum over all $d \in \mathcal{D}$, and noting that $g$ is a strictly increasing function, yields the
result.



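Appendix F
Proof of Lemma 8

The function $h$ is strictly increasing, so $h(x) \ge 1$ for all $x \ge h^{-1}(1)$. So

$g'(x) \le \frac{1}{1 + x}$ for $x \ge h^{-1}(1)$. (45)

The inverse of $g$ cannot be expressed explicitly; however, it satisfies

$g^{-1}(x) = \exp\big( x\, h(g^{-1}(x)) \big) - 1$. (46)

Therefore,

$\alpha_t \le \frac{2(1 + W_{cong}) |\mathcal{L}|}{g^{-1}(g^*) - W_{cong}} = \frac{2(1 + W_{cong}) |\mathcal{L}|}{\exp\big( g^* h(g^{-1}(g^*)) \big) - 1 - W_{cong}}$ (47)

for $g^* \ge g(1 + W_{cong} + h^{-1}(1))$.
Next, note that

$T_{t+1} \le 16^{|\mathcal{L}|} e^{4 |\mathcal{L}| (w_{\max} + g^*)}$

$\le 16^{|\mathcal{L}|} e^{4 |\mathcal{L}| \big( g(q_{\max}) + \frac{\epsilon}{4N^3} g(q_{\max}) \big)}$

$\le 16^{|\mathcal{L}|} e^{8 |\mathcal{L}| g(q_{\max})}$. (48)

Consider the product of (47) and (48) and let $K := 2(W_{cong} + 1) |\mathcal{L}| 16^{|\mathcal{L}|}$. Using (46) and (22), the condition (30)
is satisfied if

$K e^{g^* \big[ \frac{32 N^3 |\mathcal{L}|}{\epsilon} - h(g^{-1}(g^*)) \big]} \Big( 1 + \frac{1 + W_{cong}}{g^{-1}(g^*) - W_{cong}} \Big) \le \delta/16$. (49)

Consider fixed, but arbitrary, $|\mathcal{L}|$, $N$ and $\epsilon$. As $q_{\max} \to \infty$, $g(q_{\max}) \to \infty$, and consequently $g^* \to \infty$ and
$g^{-1}(g^*) \to \infty$. Therefore, the exponent $\frac{32 N^3 |\mathcal{L}|}{\epsilon} - h(g^{-1}(g^*))$ is negative for $q_{\max}$ large enough, and thus, there
is a threshold $q_{th}$ such that for all $q_{\max} > q_{th}$, the condition (49) is satisfied.

The last step of the proof is to determine $t^*$. Let $t_1$ be the first time that $q_{\max}(t)$ hits $q_{th}$; then

$\sum_{k=t_1}^{t_1+t} \frac{1}{T_k^2} \ge 16^{-2|\mathcal{L}|} \sum_{k=t_1}^{t_1+t} e^{-16 |\mathcal{L}| g(q_{\max}(k))}$

$= 16^{-2|\mathcal{L}|} \sum_{k=t_1}^{t_1+t} (1 + q_{\max}(k))^{-\frac{16 |\mathcal{L}|}{h(q_{\max}(k))}}$

$\ge 16^{-2|\mathcal{L}|}\, t\, (1 + q_{th} + t)^{-\frac{16 N^2}{h(q_{th})}}$,

and

$\min_s \pi_{t_1}(s) \ge \frac{1}{\sum_{s} \exp\big( \sum_{i \in s} \tilde{w}_i(t_1) \big)}$

$\ge \frac{1}{|\mathcal{R}| \exp\big( |\mathcal{L}| \tilde{w}_{\max}(t_1) \big)}$

$\ge \frac{1}{|\mathcal{R}| \exp\big( |\mathcal{L}| (w_{\max}(t_1) + g^*(t_1)) \big)}$

$\ge \frac{1}{2^{N^2} \exp\big( 2 N^2 g(q_{th}) \big)}$.

Therefore, by Proposition 1, it suffices to find the smallest $t$ that satisfies

$16^{-2N^2}\, t\, (1 + q_{th} + t)^{-\frac{16 N^2}{h(q_{th})}} \ge \log(4/\delta) + N^2 \log(2(1 + q_{th}))$

for a threshold $q_{th}$ large enough. Recall that $h(\cdot)$ is an increasing function; therefore, by choosing $q_{th}$ large enough,
$\frac{16 N^2}{h(q_{th})}$ can be made arbitrarily small. Then a finite $t^*$ always exists since

$\lim_{t^* \to \infty} t^* (1 + q_{th} + t^*)^{-\frac{16 N^2}{h(q_{th})}} = \infty$.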

t* — >oo 
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Appendix G 
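Proof of Proposition 1

The drift in $\pi_t$ is given by

$\| \pi_{t+1} - \pi_t \|_{1/\pi_{t+1}}^2 = \Big\| \frac{\pi_t}{\pi_{t+1}} - 1 \Big\|_{\pi_{t+1}}^2 = \sum_{s} \pi_{t+1}(s) \Big( \frac{\pi_t(s)}{\pi_{t+1}(s)} - 1 \Big)^2$

$\le \max\{ (e^{\alpha_t} - 1)^2,\ (1 - e^{-\alpha_t})^2 \} = (e^{\alpha_t} - 1)^2$

for $\alpha_t < 1$, where $\alpha_t$ is given by (29). Thus, $\| \pi_{t+1} - \pi_t \|_{1/\pi_{t+1}} \le 2 \alpha_t$ for $\alpha_t < 1$. The distance between the true
distribution and the stationary distribution at time $t$ can be bounded as follows. First, by the triangle inequality,

$\| \mu_t - \pi_t \|_{1/\pi_t} \le \| \mu_t - \pi_{t-1} \|_{1/\pi_t} + \| \pi_{t-1} - \pi_t \|_{1/\pi_t} \le \| \mu_t - \pi_{t-1} \|_{1/\pi_t} + 2 \alpha_{t-1}$.

On the other hand,

$\| \mu_t - \pi_{t-1} \|_{1/\pi_t}^2 = \sum_{s} \frac{1}{\pi_t(s)} (\mu_t(s) - \pi_{t-1}(s))^2 = \sum_{s} \frac{\pi_{t-1}(s)}{\pi_t(s)} \frac{1}{\pi_{t-1}(s)} (\mu_t(s) - \pi_{t-1}(s))^2$

$\le e^{\alpha_{t-1}} \| \mu_t - \pi_{t-1} \|_{1/\pi_{t-1}}^2$.

Therefore, for $\alpha_{t-1} < 1$,

$\Big\| \frac{\mu_t}{\pi_t} - 1 \Big\|_{\pi_t} \le (1 + \alpha_{t-1}) \| \mu_t - \pi_{t-1} \|_{1/\pi_{t-1}} + 2 \alpha_{t-1}$.

Suppose $\alpha_t \le \delta/16$; then $\| \frac{\mu_t}{\pi_t} - 1 \|_{\pi_t} \le \delta/2$ holds for $t > t^*$, if

$\| \mu_t - \pi_{t-1} \|_{1/\pi_{t-1}} \le \delta/4$

for all $t > t^*$. Define $a_t := \| \mu_{t+1} - \pi_t \|_{1/\pi_t}$. Then

$a_{t+1} = \| \mu_{t+2} - \pi_{t+1} \|_{1/\pi_{t+1}} = \| \mu_{t+1} P_{t+1} - \pi_{t+1} \|_{1/\pi_{t+1}} \le \lambda^*_{t+1} \| \mu_{t+1} - \pi_{t+1} \|_{1/\pi_{t+1}}$,

where $\lambda^*_{t+1}$ is the SLEM of $P_{t+1}$. Therefore,

$a_{t+1} \le \lambda^*_{t+1} [ (1 + \alpha_t) a_t + 2 \alpha_t ]$.

Suppose $a_t \le \delta/4$. Defining $T_t = \frac{1}{1 - \lambda^*_t}$, we have

$a_{t+1} \le \Big( 1 - \frac{1}{T_{t+1}} \Big) [ \delta/4 + (2 + \delta/4) \alpha_t ]$.

Thus, $a_{t+1} \le \delta/4$, if

$(2 + \delta/4) \alpha_t \le \frac{1}{T_{t+1}} \big( \delta/4 + (2 + \delta/4) \alpha_t \big)$,

or equivalently if $\alpha_t \le \frac{(\delta/4) / T_{t+1}}{(2 + \delta/4)(1 - 1/T_{t+1})}$. But

$\frac{(\delta/4) / T_{t+1}}{(2 + \delta/4)(1 - 1/T_{t+1})} \ge \frac{(\delta/4) / T_{t+1}}{4 (1 - 1/T_{t+1})} \ge \frac{\delta}{16\, T_{t+1}}$,

so it is sufficient to have

$\alpha_t T_{t+1} \le \delta/16$. (50)

Therefore, if there exists a time $t^*$ such that $a_{t^*} \le \delta/4$, then $a_t \le \delta/4$ for all $t > t^*$. To find $t^*$, note that $a_t > \delta/4$
for all $t < t^*$. So, for $t < t^*$, we have

$a_t \le \Big( 1 - \frac{1}{T_t} \Big) [ (1 + \alpha_{t-1}) a_{t-1} + 2 \alpha_{t-1} ] \le \Big( 1 - \frac{1}{T_t} \Big) \Big[ (1 + \alpha_{t-1}) a_{t-1} + 2 \alpha_{t-1} \frac{4 a_{t-1}}{\delta} \Big]$

$= \Big( 1 - \frac{1}{T_t} \Big) \Big( 1 + \alpha_{t-1} + \frac{8}{\delta} \alpha_{t-1} \Big) a_{t-1} \le \Big( 1 - \frac{1}{T_t} \Big) \Big( 1 + \frac{\delta/16}{T_t} \Big( 1 + \frac{8}{\delta} \Big) \Big) a_{t-1}$

$\le \Big( 1 - \frac{1}{T_t} \Big) \Big( 1 + \frac{1}{T_t} \Big) a_{t-1} = \Big( 1 - \frac{1}{T_t^2} \Big) a_{t-1} \le e^{-1/T_t^2} a_{t-1}$.

Thus, $a_t \le a_0 e^{-\sum_{k=1}^{t} 1/T_k^2}$, where

$a_0 = \Big\| \frac{\mu_1}{\pi_0} - 1 \Big\|_{\pi_0} = \| \mu_0 P_0 - \pi_0 \|_{1/\pi_0} \le \lambda^*(P_0) \| \mu_0 - \pi_0 \|_{1/\pi_0} \le \frac{1}{\sqrt{\min_s \pi_0(s)}}$.

Finally, assume that (50) holds only when $q_{\max}(t) > q_{th}$ for a constant $q_{th} > 0$. Let $t_1$ be the first time that
$q_{\max}(t)$ hits $q_{th}$. Then, after that, it takes $t^*$ time slots for the chain to get close to $\pi_t$ if $q_{\max}(t)$ remains above
$q_{th}$ for $t_1 < t < t_1 + t^*$. Alternatively, we can say that $\| \pi_t - \mu_t \|_{TV} \le \delta/4$ if $q_{\max}(t) \ge q_{th} + t^*$, since at each
time slot at most one departure can happen and this guarantees that $q_{\max}(t) \ge q_{th}$ for at least the past $t^*$ time
slots. This immediately implies the final result in the proposition.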



